DYNAMIC PRICING CRITERIA IN LINEAR PROGRAMMING


Edward  S.  Klotz 

Department  of  Operations  Research,  Stanford  University 




In recent years the interest in linear programming algorithms has increased
greatly due to the discovery of new interior-point methods. New results have also
prompted researchers to reconsider some previously discarded ideas in light of their
additional knowledge. This thesis begins with a study of some variants of the
reduced-gradient method applied to linear programs. Preliminary computational
tests revealed how sparsity and degeneracy, two characteristics present in most
practical problems, can severely inhibit such variants. The development of dynamic
pricing criteria to exclude certain columns from the search direction provides a
computationally efficient way to alleviate these difficulties. Application to the simplex
method yields a pivot rule designed to avoid degenerate pivots. Generalization of
the rule yields a cheap method to estimate the step lengths associated with potential
incoming nonbasic variables. The result is a set of pivot rules that appear
particularly useful on highly degenerate and poorly scaled linear programs. Extensive
computational tests are presented.

Key words: linear programming, simplex method, reduced-gradient method, pricing,
degeneracy.
CHAPTER 1: INTRODUCTION


After more than 40 years, Dantzig's simplex algorithm remains the most popular
method for solving linear programs. Since its invention in 1947, researchers
have proposed many other algorithms, yet none of these has consistently
outperformed the simplex algorithm. In addition, many different pivot rules (see [2], [17],
[18], [19], [43], [52] and [53]) have arisen for the method, yet most implementations
continue to use Dantzig's original pivot rule.

In 1984 Karmarkar [23] proposed a projective algorithm for linear programming
substantially different from the simplex method. Karmarkar's algorithm moves
through the interior of the feasible region, while the simplex method traverses a
sequence of feasible vertices. While it remains unclear if this new algorithm will
ultimately replace the simplex method for solving practical problems (refer to [1]
and [26]), the approach has profoundly influenced the direction of research in
linear programming. Other algorithms have arisen (see [4], [42], [49] and [54]) that
are strongly motivated by the projective method. In addition, it has prompted
researchers to reexamine previously discarded ideas in light of their new
knowledge (see [15]). The research described in this thesis began by adopting the latter
approach, experimenting with variants of the reduced-gradient method applied to
linear programming. Although these experiments did not yield an algorithm
superior to Dantzig's method, they provided insight into the major drawbacks of these
variants. The experiments also inspired a set of dynamic pricing criteria that shows
great promise to improve the simplex method.

Chapters 2, 3 and 4 comprise the majority of the thesis. Chapter 2 discusses
a set of feasible direction methods for linear programming similar to the
reduced-gradient method. After a brief description, we examine how degeneracy can inhibit
the progress of such algorithms. This obstacle motivates the development of pricing
criteria to avoid degeneracy. The idea also generalizes to deal with "near"
degeneracy. The result is a modified feasible direction method. We present computational
results for the method on some practical problems. The algorithm usually
outperformed the simplex method with respect to iterations, but it always required
more time. Nonetheless, the results reveal an improvement over previous tests of
similar methods. Pricing criteria to avoid degeneracy provide the main source of
improvement.

Chapter 3 considers the use of such pricing criteria in the context of the simplex
method. We first apply the results of Chapter 2. We then develop additional
criteria designed specifically to improve the simplex method. The result is a set
of very promising multiple-priority pivot rules. One can view such procedures as
computationally inexpensive attempts to estimate the step length associated with
a potential basic variable. Although fairly cheap, these techniques do increase the
computational effort. We therefore develop results designed to reduce the extra
work. In addition, we study a parametric simplex method due to Gass and Saaty
that Dantzig recommends in [7] to avoid cycling. Although this approach usually
performs well on highly degenerate problems, it does so without making any effort
to avoid degenerate pivots. This leads to a pivoting procedure that combines the
parametric method's selection rule with a multiple-priority pivot rule designed to
avoid degenerate pivots. The chapter concludes with some results designed to reduce
the computational burden associated with the previously described rules.

Chapter 4 presents extensive computational tests of the pivot rules described.
Each rule requires fewer iterations than the simplex method on most problems.
However, the reduction in iterations does not always result in a reduction in
computation time. Nonetheless, the rules typically perform quite well on the larger,
more difficult problems.

Chapter 5 contains a discussion of some untested ideas that may be fruitful
areas for future research. Extensions of the ideas developed here, along with
applications to nonlinear programming, are considered.

Chapter  6  summarizes  the  thesis  and  provides  some  conclusions.  The  author 
hopes  that  the  reader  will  have  acquired  some  additional  tools  in  his  arsenal  for 
solving  linear  programs,  as  well  as  some  new  insight  on  some  previously  established 
results. 

An appendix follows the thesis. It describes details pertinent to the
computational experiments; for example, the ideas described in Chapters 2 and 3 had to be
generalized to deal with bounded variables. The appendix contains no new
mathematics, but it highlights significant differences between the theory and practice of
linear programming.

1.1.  Notation 

This thesis deals primarily with linear programming. We begin by establishing
some pertinent preliminaries and notation. The author assumes that the reader is
familiar with standard linear programming theory; if not, refer to [6] or [35]. For
details on the computational aspects of linear programming, see [14], [16], [31] and
[46]. We consider the following standard form linear programming problem:

\[
\begin{array}{ll}
\text{minimize}   & c^T x \\
\text{subject to} & Ax = b \\
                  & x \ge 0,
\end{array}
\tag{1.1}
\]

where $A \in \mathbb{R}^{m \times n}$, $c \in \mathbb{R}^n$, $x \in \mathbb{R}^n$, and $b \in \mathbb{R}^m$. Without loss of generality, assume
that $m \le n$ and $A$ has full rank.

The dual of (1.1) is:

\[
\begin{array}{ll}
\text{maximize}   & b^T \pi \\
\text{subject to} & A^T \pi + v = c \\
                  & v \ge 0.
\end{array}
\tag{1.2}
\]

We will often wish to consider submatrices of $A$. Consider $I = \{i_1, \ldots, i_k\} \subseteq
\{1, \ldots, m\}$ and $J = \{j_1, \ldots, j_l\} \subseteq \{1, \ldots, n\}$. Then

$A_{I.}$

represents the submatrix of $A$ consisting of rows indexed by $\{i_1, \ldots, i_k\}$ and all
columns. Similarly,

$A_{.J}$

represents the submatrix of $A$ consisting of columns indexed by $\{j_1, \ldots, j_l\}$ and all
rows. Also,

$A_{I,J}$

consists of the submatrix of $A$ with rows indexed by $I$ and columns indexed by $J$.
This notation simplifies slightly for vectors; $c_J$ denotes the components of $c$ indexed
by $J$.

Let $B$ be an $m \times m$ nonsingular submatrix of $A$. $B$ is called a basis. Such a $B$
exists because $A$ has full rank. When appropriate, we shall also use $B = \{j_1, \ldots, j_m\}$
to index the corresponding columns of $A$; thus $B$ and $A_{.B}$ represent the same matrix.
Similarly, let $N$ index the set of nonbasic columns of $A$. Given a basis $B$, the linear
program (1.1) has the following equivalent canonical form:

\[
\begin{array}{ll}
\text{minimize}   & \bar{c}_N^T x_N \\
\text{subject to} & x_B + \bar{A}_{.N} x_N = \bar{b} \\
                  & x_B,\ x_N \ge 0,
\end{array}
\tag{1.3}
\]

where $\bar{A} = B^{-1}A$, $\bar{b} = B^{-1}b$, and $\bar{c}_N^T = c_N^T - \pi^T A_{.N}$. The $m$-vector $\pi$ comprises the
basic dual variables and is defined by the linear system

\[
\pi^T B = c_B^T.
\]

The  linear  programs  (1.1)  and  (1.3)  provide  the  framework  for  the  ideas  developed 
herein.  Additional  notation  shall  be  defined  as  the  need  arises. 
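
To make the canonical form (1.3) concrete, the following Python sketch computes $\pi$, $\bar{A}$, $\bar{b}$ and the reduced costs for a small example. The data `A`, `b`, `c`, the helper names `solve` and `canonical_form`, the zero-based indexing, and the dense Gauss-Jordan elimination with exact fractions are all assumptions made for illustration; a practical code would work with a sparse factorization of $B$ instead.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve M y = rhs by Gauss-Jordan elimination with exact fractions."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * p for a, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def canonical_form(A, b, c, basis):
    """Return (pi, A_bar, b_bar, c_bar) for the basis given as column indices."""
    m, n = len(A), len(A[0])
    B = [[A[i][j] for j in basis] for i in range(m)]
    B_t = [[B[i][k] for i in range(m)] for k in range(m)]
    pi = solve(B_t, [c[j] for j in basis])      # pi^T B = c_B^T
    b_bar = solve(B, b)                         # b_bar = B^{-1} b
    cols = [solve(B, [A[i][j] for i in range(m)]) for j in range(n)]
    A_bar = [[cols[j][i] for j in range(n)] for i in range(m)]
    c_bar = [c[j] - sum(pi[i] * A[i][j] for i in range(m)) for j in range(n)]
    return pi, A_bar, b_bar, c_bar

# Hypothetical example: two constraints, two structural and two slack columns.
A = [[1, 1, 1, 0], [1, 3, 0, 1]]
b = [4, 6]
c = [-1, -2, 0, 0]
pi, A_bar, b_bar, c_bar = canonical_form(A, b, c, basis=[0, 1])
```

For this data the basis of the first two columns gives nonnegative reduced costs on the nonbasic columns, so the associated vertex is optimal.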


CHAPTER  2:  FEASIBLE  DIRECTION  METHODS 


2.1.  Preliminaries 

Wolfe proposed the reduced-gradient method (see [50]) for linearly constrained
optimization in 1962. While primarily intended for nonlinear objective functions,
the algorithm can also solve linear programs. Since its invention, several authors
(see [5], [8], [9], [10], [12], [21], [22], [36], [37], [40] and [45]) have described similar
algorithms for linear programs. One can view these approaches as extensions of
the simplex method since they utilize the notions of basic and nonbasic variables
yet allow for feasible iterates other than vertices. Computational results in the
literature usually deal only with random problems; one exception consists of tests
performed by Kallio and Orchard-Hays [22]. The purpose of this chapter is to
broaden our understanding of such algorithms. We begin with a general description
of the approach, provided in Sections 2.2-2.4. Section 2.5 discusses the effects of
degeneracy on such algorithms, while Section 2.6 develops techniques to deal with
this problem. Section 2.7 then proposes a modified feasible direction method based
on the results of Sections 2.2-2.6. Computational results on a set of practical test
problems are presented in Section 2.8.

2.2.  The  Choice  of  Search  Direction 

Consider the linear programs (1.1) and (1.3). Assume a feasible solution $x$ and
a basis $B$. Unlike in the simplex method, $x$ here need not be a vertex, and $B$ need
not be a feasible basis. We wish to choose a search direction $q \in \mathbb{R}^n$ such that
$Aq = 0$ and $c^T q < 0$. Choose a set $P \subseteq N$ of promising variables to change. For
$j \in P$, set $\delta_j$ as the rate of change of $x_j$; let $\delta_j = 0$ for $j \notin P$. We will elaborate on
how to determine $\delta_j$ and $P$ later. Now, define the following search direction $q$:

\[
\begin{aligned}
q_P &= \delta_P \\
q_{N \setminus P} &= 0 \\
q_B &= -\bar{A}_{.P}\, q_P.
\end{aligned}
\tag{2.1}
\]

Note that $Aq = Bq_B + A_{.P}q_P + A_{.N \setminus P}\,q_{N \setminus P} = 0$. For any suboptimal $x$ we will see in
Section 2.3 that there always exists a set $P$ and rate of change $\delta_P$ such that $c^T q < 0$.
Therefore, $q$ provides a descent direction. Observe that if $B$ is a feasible basis and
$x$ is the associated vertex, then one can view the simplex method as a special case
by choosing a single promising variable with negative reduced cost $\bar{c}_j$ and setting
$\delta_j = 1$.

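Given $q$, we wish to move to a new solution

\[
\bar{x} = x + \theta q \tag{2.2}
\]

for any nonnegative scalar $\theta$. Since $q$ lies in the null space of $A$, it follows that
$A\bar{x} = b$, so we wish to choose $\theta$ to maximize the improvement in the objective
function while ensuring that $\bar{x} \ge 0$. Observe from (2.1) and (2.2) that

\[
\bar{x} = (\bar{x}_B,\ \bar{x}_P,\ \bar{x}_{N \setminus P})
        = (x_B - \theta \bar{A}_{.P}\, q_P,\ x_P + \theta q_P,\ x_{N \setminus P}).
\tag{2.3}
\]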

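Since $\bar{x}_{N \setminus P} = x_{N \setminus P} \ge 0$, only variables in $P$ and $B$ may violate their nonnegativity
requirements. First, consider the variables in $P$. For $j \in P$,

\[
\begin{aligned}
x_j + \theta q_j \ge 0 &\iff \theta q_j \ge -x_j \\
&\iff \theta \le -\frac{x_j}{q_j} \quad \text{for } j : q_j < 0 \\
&\iff \theta \le \min_{j \in P:\, q_j < 0} -\frac{x_j}{q_j}.
\end{aligned}
\tag{2.4}
\]

The same logic applies to the basic variables. For $j \in B$,

\[
x_j + \theta q_j \ge 0 \iff \theta \le \min_{j \in B:\, q_j < 0} -\frac{x_j}{q_j}.
\tag{2.5}
\]

In order for $\bar{x}$ to remain nonnegative, $\theta$ must satisfy (2.4) and (2.5) simultaneously.
In other words,

\[
\theta = \min_{j \in B \cup P:\, q_j < 0} -\frac{x_j}{q_j}.
\tag{2.6}
\]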

Recall that in the simplex method only a single nonbasic variable increases, so (2.4)
is always true, and (2.6) simplifies to the usual ratio test on the basic variables.

(2.6) guarantees that $\bar{x} = x + \theta q \ge 0$. Since $A\bar{x} = b$, $\bar{x}$ is a feasible solution for the
linear programs (1.1) and (1.3). We now have defined a procedure to determine a
search direction $q$ and an associated step length $\theta$.
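
The construction (2.1) and the ratio test (2.6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the zero-based indices, the small example data, and the choice of rates $\delta$ are hypothetical, and $\bar{A}$ is assumed to be available already for the current basis.

```python
from fractions import Fraction

def search_direction(A_bar, basis, P, delta, n):
    """Build q as in (2.1): q_P = delta_P, q_B = -A_bar[:, P] q_P, zero elsewhere."""
    q = [Fraction(0)] * n
    for j in P:
        q[j] = Fraction(delta[j])
    for i, jb in enumerate(basis):
        q[jb] = -sum(Fraction(A_bar[i][j]) * q[j] for j in P)
    return q

def step_length(x, q, basis, P):
    """Ratio test (2.6) over B and P; returns (theta, j_star) or (None, None)."""
    ratios = [(-Fraction(x[j]) / q[j], j) for j in list(basis) + list(P) if q[j] < 0]
    if not ratios:
        return None, None  # no component decreases: the step is unbounded along q
    return min(ratios)

# Hypothetical data: slack basis {2, 3}, so A_bar equals A; x is the slack vertex.
A_bar = [[1, 1, 1, 0], [1, 3, 0, 1]]
x = [0, 0, 4, 6]
basis, P = [2, 3], [0, 1]
q = search_direction(A_bar, basis, P, delta={0: 1, 1: 2}, n=4)
theta, j_star = step_length(x, q, basis, P)
x_new = [xj + theta * qj for xj, qj in zip(x, q)]
```

Here both promising variables increase at once, and the second basic variable (index 3) blocks the step, so the new point is feasible but is not a vertex.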

The algorithm is incomplete since the rules governing changes in the basis and
determination of promising variables and their rates of change remain undefined.
Sections 2.3 and 2.4 examine those aspects in detail. Meanwhile, we extend the
search procedure to allow for a second search direction $\hat{q}$. The algorithm will then
use a linear combination of $q$ and $\hat{q}$ to compute the next feasible iterate. We define
$\hat{q}$ by viewing the constraints of the linear program (1.3) in the following equivalent
form:

\[
\begin{aligned}
\bar{A}_{.P}\, x_P &\le \bar{b} - \bar{A}_{.N \setminus P}\, x_{N \setminus P} \\
x_{N \setminus P},\ x_P &\ge 0.
\end{aligned}
\tag{2.7}
\]

(2.7) considers the algorithm from the perspective of the space of the promising
variables $x_P$. The basic variables $x_B$ serve as slack variables. Observe that in this
space, movement from $x$ to $\bar{x}$ implies movement in search direction $q$ until at least
one of the hyperplanes defining the constraints is reached. In other words, once
we determine a search direction $q$, suppose that we move from the current feasible
solution $x$ in the direction $q$ as far as possible. The step length $\theta$ increases until a
slack variable attains zero; (2.6) determines the size of the step length and identifies
one or more such variables. A hyperplane corresponds to each slack driven to zero,
and we shall utilize such a hyperplane to determine a second search direction. In
order to identify a tight constraint, consider the ratio tests in (2.6). Let $j^*$ index a
variable achieving the minimum ratio, i.e.

\[
j^* = \arg\min_{j \in B \cup P:\, q_j < 0} -\frac{x_j}{q_j}.
\]

Two cases arise. If $j^* \in P$, then $\bar{x}_{j^*} = 0$. Let $r_P$ index the component of $P$
containing $j^*$; $P(r_P) = j^*$. Also, let $e_{r_P}^T \in \mathbb{R}^{|P|}$ represent the $r_P$th unit row
vector. In this case a tight hyperplane corresponds to the nonnegativity constraint
$e_{r_P}^T x_P \ge 0$ (or, equivalently, $x_{j^*} \ge 0$) found in (2.7). The second case occurs when
$j^* \in B$. In this case suppose the $r_B$th component of $B$ contains $j^*$. Then a tight
hyperplane corresponds to the constraint $\bar{A}_{r_B,P}\, x_P \le \bar{b}_{r_B} - \bar{A}_{r_B,N \setminus P}\, x_{N \setminus P}$. Summarizing
the two cases, let $\bar{a}_P$ represent the normal to a tight hyperplane determined by the
ratio tests in (2.6). Then,

\[
\bar{a}_P^T =
\begin{cases}
e_{r_P}^T & \text{if } j^* \in P; \\
\bar{A}_{r_B,P} & \text{if } j^* \in B.
\end{cases}
\tag{2.8}
\]


Given $\bar{a}_P$, project $q_P$, the promising components of the first search direction, onto
the hyperplane $\bar{a}_P^T z_P = 0$ in order to generate a second direction $\hat{q}_P$. Recall that
projection onto the null space of $\bar{a}_P^T$ is equivalent to projection onto the orthogonal
complement of the row space of $\bar{a}_P^T$. Projecting onto the row space of $\bar{a}_P^T$ involves
the projection matrix

\[
\frac{\bar{a}_P \bar{a}_P^T}{\|\bar{a}_P\|^2}.
\]

Projection onto the orthogonal complement of the row space of $\bar{a}_P^T$ involves the
projection matrix

\[
I - \frac{\bar{a}_P \bar{a}_P^T}{\|\bar{a}_P\|^2}.
\]

Hence,

\[
\hat{q}_P = q_P - \frac{\bar{a}_P^T q_P}{\|\bar{a}_P\|^2}\, \bar{a}_P.
\tag{2.9}
\]


Notice that if $j^* \in P$, $\bar{a}_P$ is a unit vector, and the computation of $\hat{q}_P$ requires
very little extra work:

\[
\hat{q}_P = q_P - e_{r_P}\, q_{j^*}.
\]

However, if $j^* \in B$, then

\[
\hat{q}_P = q_P - \left( \frac{e_{r_B}^T B^{-1} A_{.P}\, q_P}{\| e_{r_B}^T B^{-1} A_{.P} \|^2} \right) \left( e_{r_B}^T B^{-1} A_{.P} \right)^T.
\]

In this case one must solve the linear system $w^T B = e_{r_B}^T$ and then compute inner
products between $w$ and $A_{.j}$ for each $j \in P$. These computations require significant
extra work.

Given $\hat{q}_P$, we ensure that $\hat{q}$ lies in the null space of $A$ by setting

\[
\hat{q}_{N \setminus P} = 0 \quad \text{and} \quad \hat{q}_B = -\bar{A}_{.P}\, \hat{q}_P.
\tag{2.10}
\]
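
The projection (2.9) and the completion (2.10) amount to a few inner products once the normal $\bar{a}_P$ is known. The Python sketch below illustrates both cases of (2.8); the function names and the small example (a row $\bar{A}_{r_B,P} = (1, 3)$ and $q_P = (1, 2)$) are assumptions for illustration, and the row of $\bar{A}$ is taken as given rather than obtained by solving $w^T B = e_{r_B}^T$.

```python
from fractions import Fraction

def project_out(q_P, a_P):
    """Return q_P - (a_P^T q_P / ||a_P||^2) a_P, as in (2.9)."""
    q_P = [Fraction(v) for v in q_P]
    a_P = [Fraction(v) for v in a_P]
    coeff = sum(a * q for a, q in zip(a_P, q_P)) / sum(a * a for a in a_P)
    return [q - coeff * a for q, a in zip(q_P, a_P)]

def basic_part(A_bar_P, q_hat_P):
    """Return the basic components -A_bar[:, P] q_hat_P, as in (2.10)."""
    return [-sum(Fraction(a) * q for a, q in zip(row, q_hat_P)) for row in A_bar_P]

# Hypothetical data: columns of A_bar indexed by P, and the first direction q_P.
A_bar_P = [[1, 1], [1, 3]]
q_P = [1, 2]

# Case j* in B with r_B = 1 (zero-based): the normal is row r_B of A_bar_P.
q_hat_P = project_out(q_P, A_bar_P[1])
q_hat_B = basic_part(A_bar_P, q_hat_P)

# Case j* in P: the normal is a unit vector, so one component is simply zeroed.
q_hat_unit = project_out(q_P, [0, 1])
```

In the first case the basic component that blocked the step has a zero entry in the second direction, so moving along it keeps the tight constraint tight.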
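Given a feasible solution $x$, we move to a new solution

\[
\begin{aligned}
\hat{x} &= (\hat{x}_B,\ \hat{x}_P,\ \hat{x}_{N \setminus P}) \\
        &= (x_B - \theta \bar{A}_{.P}\, q_P - \mu \bar{A}_{.P}\, \hat{q}_P,\ \ x_P + \theta q_P + \mu \hat{q}_P,\ \ x_{N \setminus P}).
\end{aligned}
\tag{2.11}
\]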

Since $\hat{q}$ lies in the null space of $A$, $A\hat{x} = b$ for all values of $\theta$ and $\mu$. We wish
to choose $\theta$ and $\mu$ so that $\hat{x}$ satisfies the nonnegativity requirement while bringing
about the largest possible change in the objective function. With a single search
direction this occurs when $\theta$ is chosen as large as possible subject to the constraints
specified by the ratio tests in (2.4) and (2.5); one can view this as a one-column
linear program with solution given by (2.6). Finding the best combination of two
search directions requires the solution of a two-column linear program:

\[
\begin{array}{ll}
\text{minimize}   & (\bar{c}_P^T q_P)\,\theta + (\bar{c}_P^T \hat{q}_P)\,\mu \\
\text{subject to} & (-q_P)\,\theta + (-\hat{q}_P)\,\mu \le x_P \\
                  & (\bar{A}_{.P}\, q_P)\,\theta + (\bar{A}_{.P}\, \hat{q}_P)\,\mu \le x_B.
\end{array}
\tag{2.12}
\]

Remember that $x$ is a feasible solution, so $\theta$ and $\mu$ are the only variables involved
in (2.12). The objective function in this linear program measures the change in
the objective of the linear program (1.1) achieved by moving from $x$ to $\hat{x}$. The
first set of constraints corresponds to the nonnegativity requirements on $x_P$ implied
by (2.11), while the second set corresponds to the analogous requirement for $x_B$.
Therefore, $\hat{x}$ is feasible for the linear programs (1.1) and (1.3).

We have just described a procedure that defines a two-dimensional search plane
instead of the usual one-dimensional search line. The use of a search plane requires
significant additional computation, but it offers an extra dimension that can help
the algorithm avoid getting stuck in a long sequence of iterations with minimal
progress. One can solve two-variable linear programs quickly. One could implement
a specialized simplex method that exploits the fact that the constraint matrix of the
linear program (2.12) consists of two columns plus an identity matrix. Therefore,
during any iteration, a basis contains at most two columns other than unit vectors,
which should reduce the work involved in basis factorizations and updates. A second
approach consists of using a special linear-time algorithm for two-variable linear
programming (see [30]). The latter method was adopted here.

2.3.  The  Choice  of  Promising  Variables 

Section 2.2 described how to change a feasible solution $x$ to a new feasible
solution $\bar{x}$ (or $\hat{x}$, if one uses two search directions). This section investigates the
determination of $P$, the index set of promising variables. We wish to choose $P$ so
that a decrease in the objective function will always result from the change in $x$.
In the literature the most frequently encountered selection rule consists of selecting
each nonbasic variable that would individually decrease the objective function. In
other words,

\[
j \in P \quad \text{if} \quad \bar{c}_j < 0,\ x_j \ge 0 \quad \text{or} \quad \bar{c}_j > 0,\ x_j > 0.
\tag{2.13}
\]

We then set $q_j = -\bar{c}_j$ for $j \in P$. (2.1) now precisely specifies the search direction
$q$. Note that because $x$ need not be a vertex, the objective function will improve by
decreasing a positive nonbasic variable with positive reduced cost; such variables
do not exist in the simplex method. In Section 2.7 we will modify (2.13), but the
change will not affect any of the ideas discussed in the remainder of this section.

Notice that the simplex method's one-to-one correspondence between bases
and feasible solutions no longer exists. In fact, one can associate a feasible
solution $x$ with any basis, even an infeasible one. Since the values of the reduced
costs depend on the choice of basis, so does the determination of $P$. Given $x$, a
particular nonbasic variable could increase for one associated basis yet decrease for
another. Nonetheless, the following lemma shows that $q$ remains a descent direction
regardless of the choice of basis.

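Lemma 1. Given a feasible solution $x$ and any basis $B$ for the linear program
(1.1), suppose one defines $q$ as follows:

\[
\begin{aligned}
q_P &= -\bar{c}_P \\
q_{N \setminus P} &= 0 \\
q_B &= -\bar{A}_{.P}\, q_P = \bar{A}_{.P}\, \bar{c}_P.
\end{aligned}
\tag{2.14}
\]

Then $c^T q < 0$.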

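Proof. We prove the lemma by partitioning $c^T$ and $q$ by $B$, $P$, and $N \setminus P$:

\[
\begin{aligned}
c^T q &= c_B^T q_B + c_P^T q_P + c_{N \setminus P}^T\, q_{N \setminus P} \\
      &= c_B^T \bar{A}_{.P}\, \bar{c}_P - c_P^T \bar{c}_P + 0 \\
      &= c_B^T B^{-1} A_{.P}\, \bar{c}_P - c_P^T \bar{c}_P \\
      &= (\pi^T A_{.P} - c_P^T)\, \bar{c}_P \\
      &= -\|\bar{c}_P\|^2 < 0. \quad \bullet
\end{aligned}
\]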
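
The identity $c^T q = -\|\bar{c}_P\|^2$ is easy to check numerically. The Python sketch below applies rule (2.13) and definition (2.14) at a feasible point that is not a vertex. The function names and the example data are assumptions for illustration; the values of $\bar{A}$ and $\bar{c}$ are those of the basis formed by the first two columns of the hypothetical constraint matrix with rows (1, 1, 1, 0) and (1, 3, 0, 1).

```python
from fractions import Fraction

def promising_set(c_bar, x, nonbasic):
    """Rule (2.13): j is promising if c_bar_j < 0, or if c_bar_j > 0 and x_j > 0."""
    return [j for j in nonbasic if c_bar[j] < 0 or (c_bar[j] > 0 and x[j] > 0)]

def reduced_gradient_direction(A_bar, c_bar, basis, P, n):
    """Direction (2.14): q_P = -c_bar_P, q_B = A_bar[:, P] c_bar_P, zero elsewhere."""
    q = [Fraction(0)] * n
    for j in P:
        q[j] = -c_bar[j]
    for i, jb in enumerate(basis):
        q[jb] = sum(A_bar[i][j] * c_bar[j] for j in P)
    return q

half = Fraction(1, 2)
c = [-1, -2, 0, 0]
c_bar = [0, 0, half, half]                       # reduced costs for basis {0, 1}
A_bar = [[1, 0, 3 * half, -half], [0, 1, -half, half]]
x = [1, 1, 2, 2]                                 # feasible, but not a vertex

P = promising_set(c_bar, x, nonbasic=[2, 3])
q = reduced_gradient_direction(A_bar, c_bar, [0, 1], P, 4)
lhs = sum(ci * qi for ci, qi in zip(c, q))       # c^T q
rhs = -sum(c_bar[j] ** 2 for j in P)             # -||c_bar_P||^2
```

Both nonbasic variables are positive with positive reduced cost, so both enter $P$ and the direction decreases them.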

Lemma 1 implies that a feasible descent direction always exists if the current
solution is not optimal. Note that although the resulting second direction $\hat{q}$ need not
guarantee descent, the optimal solution $(\theta^*, \mu^*)$ to the two-column linear program
(2.12) yields the descent direction $\theta^* q + \mu^* \hat{q}$ used to determine $\hat{x}$ in (2.11). This is
true because one can set $\mu = 0$ in (2.12) and use only $q$ as a search direction.

The  loosening  of  the  relationship  between  bases  and  feasible  solutions  discussed 
here  raises  a  similar  ambiguity  with  respect  to  optimality  of  a  solution.  One  would 
hope  that  if  the  set  P  is  empty  during  a  particular  iteration,  then  the  current 
feasible  solution  is  in  fact  optimal.  Lemma  2  reveals  this  to  be  true  regardless  of 
the  associated  basis. 

Lemma 2. Given a feasible solution $x$ and a basis $B$ for the linear program (1.1),
if $P$ is empty, then $x$ is optimal.

Proof. We prove the lemma by utilizing the complementary slackness conditions,
which are necessary and sufficient for optimality. First, note that for $j \in B$, $\bar{c}_j = 0$
and $x_j \ge 0$. Furthermore, since $P$ is empty, its definition (2.13) implies that for
$j \in N$, $x_j = 0$ if $\bar{c}_j > 0$, and $\bar{c}_j = 0$ if $x_j > 0$. We conclude that the primal feasible
solution $x$ and the dual feasible solution $(\pi, \bar{c})$ satisfy the complementary slackness
conditions. •
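
As a small check of Lemma 2, the Python sketch below tests whether $P$ is empty and, separately, whether complementary slackness holds. The function names and the data are hypothetical; the reduced costs are those of an assumed example with basis $\{0, 1\}$ (zero-based).

```python
from fractions import Fraction

def promising_set_is_empty(x, c_bar, nonbasic):
    """True if rule (2.13) selects no nonbasic variable."""
    return not any(c_bar[j] < 0 or (c_bar[j] > 0 and x[j] > 0) for j in nonbasic)

def complementary_slackness_holds(x, c_bar):
    """True if c_bar >= 0 and c_bar_j * x_j = 0 for every j."""
    return all(cj >= 0 for cj in c_bar) and all(cj * xj == 0 for cj, xj in zip(c_bar, x))

half = Fraction(1, 2)
c_bar = [0, 0, half, half]

at_vertex = [3, 1, 0, 0]     # nonbasic variables at zero
interior = [1, 1, 2, 2]      # nonbasic variables positive

empty_at_vertex = promising_set_is_empty(at_vertex, c_bar, nonbasic=[2, 3])
empty_interior = promising_set_is_empty(interior, c_bar, nonbasic=[2, 3])
```

On this data the two tests agree at both points, as the lemma predicts for the empty case.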

Lemma 2 provides a simple termination criterion for the algorithm. Notice
that, unlike the simplex method, the resulting optimal solution need not be a vertex.
When solving practical linear programs, the modeller frequently wishes to perform
sensitivity analysis. In order to do so, the optimal basis must correspond directly
with the optimal solution. Unlike the simplex method, the algorithm described
here does not guarantee this correspondence. Fortunately, one can design a simple
procedure to move from an optimal solution to an optimal vertex. For details refer
to the appendix.

2.4.  Pivoting  Strategies 

After defining the set of promising variables and moving to a new feasible
solution, all that remains undefined in an iteration is a pivoting procedure to change the
basis. For expository purposes assume the algorithm uses a single search direction
$q$; the procedure is very similar for two search directions. Many different pivoting
strategies are available. For the computational tests in Section 2.8, the pivot rule
depends on the value of the step length $\theta$ defined by (2.6).

Case 1: Suppose that $\theta > 0$. Given the precise definition of $q$ in (2.14), we
specify the ratio test in (2.6):

\[
\theta = \min_{j:\, q_j < 0} -\frac{x_j}{q_j}
       = \min \left\{ \min_{j \in P:\, \bar{c}_j > 0} \frac{x_j}{\bar{c}_j},\ \
                      \min_{i:\, \bar{A}_{i,P}\bar{c}_P < 0} -\frac{x_{j_i}}{\bar{A}_{i,P}\,\bar{c}_P} \right\}.
\tag{2.15}
\]

Recall that $\bar{x}$ is the current feasible solution resulting from the iterative step (2.2).
Let $s$ index the entering basic variable; choose the largest promising variable to
enter the basis:

\[
s = \arg\max_{j \in P} \bar{x}_j.
\tag{2.16}
\]

Other criteria were also tested, but (2.16) emerged from the experiments as the best
one.

Selection of the outgoing variable depends on the sign of the reduced cost $\bar{c}_s$.
If $\bar{c}_s < 0$, we wish to increase the incoming variable. However, if the ratio test in
(2.15) results in a basic variable reaching zero, we also wish to remove that variable
from the basis. The following procedure always achieves at least one of these goals.
As in Section 2.2, let $r_B$ be a component of $B$ containing a basic variable driven
to zero by the basic variable ratio test in (2.15); $j_{r_B}$ indexes the corresponding
variable. Define $\bar{A}_{.s} = B^{-1}A_{.s}$ as the representation of the incoming column with
respect to the current basis. If $\bar{A}_{r_B,s} > 0$, then select $j_{r_B}$ as the outgoing basic
variable and change the basis. Otherwise, select the outgoing variable by the usual
simplex method ratio test:

\[
r = \arg\min_{i:\, \bar{A}_{i,s} > 0} \frac{\bar{x}_{j_i}}{\bar{A}_{i,s}}.
\tag{2.17}
\]

Choose the $r$th basic variable $j_r$ to leave the basis. Associated with $r$ is $\bar{\theta}$, the
increase in the entering variable:

\[
\bar{\theta} = \min_{i:\, \bar{A}_{i,s} > 0} \frac{\bar{x}_{j_i}}{\bar{A}_{i,s}}.
\tag{2.18}
\]

Note that if $\bar{A}_{.s} \le 0$, the algorithm terminates with an unbounded solution. If not,
update the current feasible solution:

\[
\begin{aligned}
\bar{x}_s &\leftarrow \bar{x}_s + \bar{\theta} \\
\bar{x}_B &\leftarrow \bar{x}_B - \bar{A}_{.s}\,\bar{\theta}.
\end{aligned}
\]

Note that the updated solution remains feasible. Now, replace the $r$th column of
the basis with $A_{.s}$. At this point a few observations are in order. First of all, note
that if $r_B$ exists and $\bar{A}_{r_B,s} > 0$, then $r_B$ is an argmin of the standard ratio test
(2.17), and the corresponding increase $\bar{\theta}$ is zero. Note also that if no basic variable
hits zero in the ratio test (2.15), then no such index $r_B$ exists; proceed immediately
with the ratio test (2.17). This completes the pivot rule when $\bar{c}_s < 0$.

If $\bar{c}_s > 0$, we wish to decrease the incoming variable as much as possible
without violating its nonnegativity requirement. One must also ensure that the
basic variables remain nonnegative. In this case, if $r_B$ exists and $\bar{A}_{r_B,s} < 0$, then
the basic variable $j_{r_B}$ leaves the basis; do not perform a ratio test. Otherwise,
perform the following ratio test to preserve nonnegativity of the basic variables:

\[
r = \arg\min_{i:\, \bar{A}_{i,s} < 0} -\frac{\bar{x}_{j_i}}{\bar{A}_{i,s}}.
\tag{2.19}
\]


The difference between (2.17) and (2.19) occurs because the entering variable
decreases when $\bar{c}_s > 0$. In addition, define

\[
\bar{\theta} = \min_{i:\, \bar{A}_{i,s} < 0} -\frac{\bar{x}_{j_i}}{\bar{A}_{i,s}}.
\tag{2.20}
\]


$\bar{\theta}$ is the largest possible decrease in the entering variable that preserves nonnegativity
of the basic variables. Remember that one must also guarantee that the decrease
in the entering variable does not violate its nonnegativity constraint. Observe that
this implies that the algorithm never terminates with an unbounded solution when
$\bar{c}_s > 0$. If $\bar{\theta} \ge \bar{x}_s$, the entering variable reaches zero without driving any of the basic
variables below zero. In this case avoid the pivot and merely update the present
solution:

\[
\begin{aligned}
\bar{x}_B &\leftarrow \bar{x}_B + \bar{A}_{.s}\,\bar{x}_s \\
\bar{x}_s &\leftarrow 0.
\end{aligned}
\]

If $\bar{\theta} < \bar{x}_s$, then a pivot is necessary. Update $\bar{x}$ as follows:

\[
\begin{aligned}
\bar{x}_s &\leftarrow \bar{x}_s - \bar{\theta} \\
\bar{x}_B &\leftarrow \bar{x}_B + \bar{A}_{.s}\,\bar{\theta}.
\end{aligned}
\]

Replace $A_{.j_r}$ with $A_{.s}$ as the $r$th basic column. This completes the pivoting
procedure when $\bar{c}_s > 0$.
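
The two ratio tests differ only in the sign of the entries they examine, which the following Python sketch makes explicit. It is an illustration under stated assumptions: the function names and data are hypothetical, indices are zero-based, the shortcut that removes $j_{r_B}$ without a ratio test is omitted, and the column $\bar{A}_{.s}$ is assumed to be available.

```python
from fractions import Fraction

def leaving_ratio_test(x_bar, basis, col, increasing):
    """(2.17)/(2.18) if the entering variable increases, (2.19)/(2.20) if it
    decreases. Returns (r, theta_bar), or (None, None) if no row is eligible."""
    sign = 1 if increasing else -1
    ratios = [(Fraction(x_bar[jb]) / (sign * col[i]), i)
              for i, jb in enumerate(basis) if sign * col[i] > 0]
    if not ratios:
        return None, None
    theta_bar, r = min(ratios)
    return r, theta_bar

def apply_change(x_bar, basis, s, col, change, increasing):
    """Move the entering variable by `change` and adjust the basic variables."""
    sign = 1 if increasing else -1
    x_new = [Fraction(v) for v in x_bar]
    x_new[s] += sign * change
    for i, jb in enumerate(basis):
        x_new[jb] -= sign * col[i] * change
    return x_new

# Increase: slack basis {2, 3}, entering s = 1 with column (1, 3).
r_up, theta_up = leaving_ratio_test([0, 0, 4, 6], [2, 3], [1, 3], increasing=True)
x_up = apply_change([0, 0, 4, 6], [2, 3], 1, [1, 3], theta_up, increasing=True)

# Decrease: basis {0, 1}, entering s = 2 with column (3/2, -1/2) and x_s = 2.
col = [Fraction(3, 2), Fraction(-1, 2)]
x_bar = [1, 1, 2, 2]
r_dn, theta_dn = leaving_ratio_test(x_bar, [0, 1], col, increasing=False)
change = min(theta_dn, x_bar[2])          # never push the entering variable below zero
pivot_needed = theta_dn < x_bar[2]
x_dn = apply_change(x_bar, [0, 1], 2, col, change, increasing=False)
```

In the decreasing example the entering variable reaches zero no later than the blocking basic variable, so the solution is updated without a pivot.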

Case 2: Suppose that $\theta$, the step length in (2.15), is zero. Observe that
promising variables with negative reduced costs cannot directly bound $\theta$. Since
$x_j > 0$ if $j \in P$ and $\bar{c}_j > 0$, one sees that among the promising variables the
minimum ratio in (2.15) must be positive. It follows that $\theta = 0$ only if at least one
basic variable $j_r$ is zero and the corresponding component in the search direction
$q_{j_r} = \bar{A}_{r,P}\,\bar{c}_P < 0$. Experimentation with a set of 12 practical problems suggested
the benefit of removing such variables from the basis. Accordingly, a procedure to
ensure removal was adopted. Define

\[
S = \{\, j \in P : \bar{c}_j < 0,\ \bar{A}_{r,j} > 0 \ \ \text{or} \ \ \bar{c}_j > 0,\ \bar{A}_{r,j} < 0 \,\}.
\]
Since $\bar{A}_{r,P}\,\bar{c}_P < 0$, we know that $\exists\, j \in P$ : $\bar{c}_j < 0$ and $\bar{A}_{r,j} > 0$, or $\bar{c}_j > 0$ and
$\bar{A}_{r,j} < 0$. Hence, $|S| \ge 1$. In other words, there exists at least one promising
variable whose entrance into the basis will remove the $r$th basic variable. $S$ indexes
all such variables. We now select the largest nonbasic variable in $S$ to enter the
basis:

\[
s = \arg\max_{j \in S} \bar{x}_j.
\]

One can improve the stability of the algorithm by modifying the rule to ensure a
sufficiently large pivot element. Murtagh and Saunders [32] recommend the following
modification. Define

\[
a = \max_{j \in S} |\bar{A}_{r,j}|,
\]

and choose

\[
s = \arg\max_{j \in S} \{\, \bar{x}_j : |\bar{A}_{r,j}| \ge 0.1a \,\}.
\]

Observe that if $\bar{c}_s < 0$, then $\bar{A}_{r,s} > 0$ since $s \in S$; hence we see from the standard
ratio test (2.17) that the $r$th basic variable is eligible to leave the basis. Similarly,
the ratio test (2.19) assures eligibility if $\bar{c}_s > 0$. In each case $\bar{\theta}$, the change in the
incoming variable, is zero, so no change in the current feasible solution occurs. The
pivoting procedure when $\theta = 0$ is now complete.

We have now accounted for all possible values of $\theta$. The pivot rule is complete.
We can now update the basis and proceed with the next iteration.
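
The selection of the entering variable in Case 2, including the pivot-size safeguard, can be sketched as follows. The function name, the dictionary-based inputs and the numerical data are assumptions for illustration; only the definition of $S$ and the 0.1 relative threshold come from the text.

```python
from fractions import Fraction

def choose_entering(P, c_bar, row_r, x_bar, rel_tol=Fraction(1, 10)):
    """Pick the largest variable in S whose pivot entry is not too small.

    row_r[j] holds the entry of A_bar in row r and column j. Returns None if
    S is empty, which the text shows cannot happen when A_bar[r, P] c_bar_P < 0.
    """
    S = [j for j in P
         if (c_bar[j] < 0 and row_r[j] > 0) or (c_bar[j] > 0 and row_r[j] < 0)]
    if not S:
        return None
    a = max(abs(row_r[j]) for j in S)
    eligible = [j for j in S if abs(row_r[j]) >= rel_tol * a]
    return max(eligible, key=lambda j: x_bar[j])

# Hypothetical data for three promising variables.
P = [0, 1, 2]
c_bar = {0: -1, 1: -2, 2: 1}
row_r = {0: 5, 1: Fraction(1, 4), 2: -1}
x_bar = {0: 0, 1: 3, 2: 2}

s_safe = choose_entering(P, c_bar, row_r, x_bar)
s_plain = choose_entering(P, c_bar, row_r, x_bar, rel_tol=0)
```

With the threshold switched off the rule would pick variable 1, whose pivot entry is only 1/4 against a largest entry of 5; the safeguard rejects it and selects variable 2 instead.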

2.5.  Effects  of  Degeneracy 

The previous three sections provided a general description of a type of feasible
direction method for linear programming. Some flexibility exists with respect
to selection of promising variables, pivoting procedure, and other details, but the
fundamental approach remains unchanged.

Very little computational testing of this type of algorithm exists in the
literature. Nonetheless, several publications remain noteworthy. In 1979 Cooper and
Kennington [5] discussed linear programming algorithms:

We find it curious that the literature contains so few papers concerning
other algorithms for such an important class of problems. We assume
either (i) other ideas have been investigated, abandoned, and never reported,
or (ii) the simplex method has proved so effective that other investigators
felt no motivation to work in this area.

They  went  on  to  propose  algorithms  within  the  class  described  here.  No  compu¬ 
tational  t'  -ting  was  performed.  Sherali,  Soyster,  and  Baines  [45]  tested  a  similar 
algorithm  on  a  set  of  random  problems.  They  remarked: 

Computationally,  this  method  turned  out  to  be  substantially  inferior 
to  the  simplex  method  . . .  One  may  expect  in  this  instance  that  after  some 
rapid  initial  improvements,  the  reduced  gradient  procedure  goes  through 
many  more  insignificant  iterations.  This  was  not  the  case  . . .  What  appears 
to  happen  is  that  instead  of  jumping  along  the  simplex  path,  and  hence 
rendering  itself  advantageous,  the  procedure  zigzags  across  the  simplex 
path,  resulting  in  several  more  iterations. 


Eiselt  and  Sandblom  [8]  also  report  discouraging  results  for  a  similar  approach: 

The  intended  “shooting  through  polytopes”  in  our  study  resulted  in 
many  problems,  most  prominently  numerical  instability  and  convergence 
problems.  On  that  basis,  the  method  was  referred  to  as  “crawling  and 
stalling”  and  work  on  it  was  discontinued. 

Eiselt  and  Sandblom  altered  their  approach  to  allow  for  the  notion  of  external 
pivoting.  With  this  modification  they  reported  encouraging  computational  results 
on  a  set  of  random  problems.  Meanwhile,  Kallio  and  Orchard-Hays  [22]  tested  a 
reduced-gradient  method  on  a  set  of  non-trivial  practical  problems.  They  restricted 
the  set  of  promising  variables  to  at  most  seven,  regardless  of  problem  size.  They 
also  utilized  a  multiple  pricing  procedure  in  an  attempt  to  reduce  the  average  work 
per  iteration.  Nonetheless,  they  too  found  their  approach  required  more  iterations 
than  the  simplex  method  on  most  of  their  test  problems. 

The initial tactic adopted here was to utilize the two-dimensional search method
described in Section 2.2. It was hoped that this method would alleviate the difficulties
described by these authors. However, the flexibility provided by an additional
direction proved insufficient to make the approach competitive with the simplex
method. Iteration counts ranged from 1.3 to 2.1 times those of the simplex method
on the 12 practical problems tested. CPU times were not even considered since each
iteration requires significant extra work compared to a simplex iteration.

The reader may find these results surprising; intuitively one might anticipate
that moving through the feasible region instead of around it would provide a substantial
advantage over the simplex method. However, detailed examination of these
algorithms reveals an explanation for the poor performance on practical problems.

Sparsity and degeneracy are two characteristics of practical problems that are
absent from most randomly generated problems. Although one can generate sparse,
random problems, they still lack the sparsity patterns characteristic of real problems.
Sparsity and degeneracy inhibit the performance of the algorithms of Sections
2.2-2.4. In particular, values of zero in the ratio test that determines the step length
occur much more frequently than in the simplex method. To see why, recall the
ratio tests involved to determine the step lengths. If we use a single search direction,
define P by (2.13), and set $q_j = -\bar{c}_j$ for $j \in P$, then the step length $\theta$ is bounded
above (see (2.6) and (2.15)) by the following minimum ratio:

    $\theta \le \min_{i : \bar{A}_{i,P}(-\bar{c}_P) > 0} \frac{x_{j_i}}{\bar{A}_{i,P}(-\bar{c}_P)}.$    (2.21)

On the other hand, a slightly different ratio test bounds the step length in the
simplex method:

    $\theta \le \min_{i : \bar{A}_{is} > 0} \frac{x_{j_i}}{\bar{A}_{is}}.$    (2.22)

Note that the simplex method uses a single column $\bar{A}_{.s}$ in the ratio test, while
the feasible direction method uses a linear combination of many such individual
columns. Why is this difference significant? On problems tested here the canonical
columns $\bar{A}_{.j}$ typically remained sparse, albeit not to the extent of the corresponding
original columns $A_{.j}$. Taking a linear combination of many such sparse columns as
in the ratio test (2.21) increases the density of the composite column $\bar{A}_{.P}(-\bar{c}_P)$.
This occurs regardless of the choice of search direction if |P| is substantial. The
result, as illustrated by Figure 2-1, is that the dense composite column contains
many positive components, each of which is eligible in the ratio test. Contrast this
with the simplex method, where usually only a few positive components exist.

Figure 2-1. Degeneracy and Linear Programming Algorithms
[Two panels showing the sign pattern (+, -, 0) of the ratio-test column beside $x_B$.
Feasible Direction Method: $\bar{A}_{.P}(-\bar{c}_P)$ and $x_B$. Simplex Method: $\bar{A}_{.j}$ and $x_B$.]

Given  the  presence  of  degeneracy  in  the  basic  variables,  one  sees  that  the 
minimum  ratio  of  (2.21)  equals  zero  with  greater  likelihood  than  that  of  (2.22). 
This  also  explains  why  the  simplex  method  can  frequently  solve  degenerate  linear 
programs  without  performing  too  many  degenerate  pivots;  the  degenerate  basic 
variables  frequently  correspond  to  non-positive  components  of  the  column  in  the 
ratio test. The same cannot be said for the algorithms in Sections 2.2-2.4.

This kind of difficulty need not arise only in the presence of degeneracy. As
long as the nonbasic columns $\bar{A}_{.j}$ remain sparse, feasible direction methods will
frequently be restricted by the step length associated with the worst promising
column. To see this, consider the following small numerical example:

    $\bar{A}_{.1} = (1, 2, -2, 0, 0, 0, 0)^T,$
    $\bar{A}_{.2} = (0, -1, 1, -1, 0, 0, 0)^T,$
    $\bar{A}_{.3} = (0, 0, 0, 0, 0, 1, 8)^T,$
    $x_B = (1, 2, 1, 4, 2, \epsilon, 8)^T,$
    $(\bar{c}_1, \bar{c}_2, \bar{c}_3) = (-1, -1, -\tfrac{1}{2}).$    (2.23)

In this example P = {1,2,3}, and $0 < \epsilon < 1$. In the simplex method the step
lengths associated with entering columns 1, 2, and 3 into the basis are 1, 1, and $\epsilon$,
respectively. The corresponding improvements in objective function are 1, 1, and
$\epsilon/2$. Since $\epsilon$ may be arbitrarily small, we see that columns 1 and 2 are good choices,
while column 3 is a poor one. Fortunately, the standard simplex method would
choose 1 or 2 to enter the basis in this situation. Given the choice of P, we see that

    $\bar{A}_{.P}(-\bar{c}_P) = (1, 1, -1, -1, 0, \tfrac{1}{2}, 4)^T.$

From the ratio test (2.21) it follows that the corresponding step length cannot exceed
$2\epsilon$. Similarly, since $\bar{c}_P^T q_P = -\bar{c}_P^T \bar{c}_P = -\tfrac{9}{4}$, the resulting improvement in the objective
function is at most $\tfrac{9}{2}\epsilon$. Thus, for $\epsilon < \tfrac{2}{9}$ the feasible direction algorithm yields a
smaller improvement than the simplex method. Note, however, that if P = {1,2},
then

    $\bar{A}_{.P}(-\bar{c}_P) = (1, 1, -1, -1, 0, 0, 0)^T.$

The resulting step length is now 1 and the improvement in the objective function is 2,
so the feasible direction approach outperforms the simplex method. The important
observation here is that $\bar{A}_{.3}$ is complementary to $\bar{A}_{.1}$ and $\bar{A}_{.2}$. Therefore, $\bar{A}_{.3}$
limits the feasible direction step length regardless of the benefit of other promising
columns. Because of sparsity, this situation occurs frequently in practical problems,
and it creates a significant obstacle for any of the algorithms previously described
herein. Of course, the simplex method, which is a feasible direction method that
selects only one promising variable, may also choose poorly. However, the point
here is that it takes only one bad column to inhibit the step length; an algorithm
selects such a column more frequently when it chooses many promising variables
instead of one.
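The arithmetic of this example can be checked with a short script. The sketch below assumes the data of example (2.23) are the three canonical columns, basic values, and reduced costs as we read them from the text, with $\epsilon = 1/10$; the helper names `ratio_test`, `composite`, and `fd_result` are illustrative only.

```python
from fractions import Fraction as F


def ratio_test(x_B, column):
    """Minimum ratio x_i / a_i over rows with a_i > 0 (None if no such row)."""
    ratios = [x / a for x, a in zip(x_B, column) if a > 0]
    return min(ratios) if ratios else None


def composite(columns, cbar, P):
    """Composite column sum over j in P of (-cbar_j) * A_j, as in (2.21)."""
    m = len(next(iter(columns.values())))
    return [sum(-cbar[j] * columns[j][i] for j in P) for i in range(m)]


eps = F(1, 10)
A = {1: [1, 2, -2, 0, 0, 0, 0],
     2: [0, -1, 1, -1, 0, 0, 0],
     3: [0, 0, 0, 0, 0, 1, 8]}
x_B = [F(v) for v in (1, 2, 1, 4, 2, eps, 8)]
cbar = {1: F(-1), 2: F(-1), 3: F(-1, 2)}

# Simplex method: one ratio test per candidate column.
simplex_steps = {j: ratio_test(x_B, A[j]) for j in A}


def fd_result(P):
    """Step length and objective improvement for the direction q_P = -cbar_P."""
    step = ratio_test(x_B, composite(A, cbar, P))
    gain = step * sum(cbar[j] ** 2 for j in P)
    return step, gain


step_all, gain_all = fd_result([1, 2, 3])   # column 3 included
step_12, gain_12 = fd_result([1, 2])        # column 3 screened out
```

With all three columns the step is limited to $2\epsilon$ by the row in which column 3 meets the small basic variable; dropping column 3 restores a unit step.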

How  can  we  overcome  these  difficulties?  Observe  that  in  the  numerical  example 
the  feasible  direction  method  progresses  nicely  when  we  exclude  column  3  from  the 
set  of  promising  variables.  This  suggests  the  benefit  of  screening  out  certain  bad 
columns  as  unpromising,  even  though  they  satisfy  the  previous  promising  criterion 
(2.13).  In  the  next  section  we  shall  develop  some  computationally  inexpensive 
techniques  to  do  so. 

2.6.  Dynamic  Pricing 

Section 2.5 demonstrated how a single column with a small step length can
inhibit progress of the algorithms of Sections 2.2-2.4. We now develop methods to
screen out such columns in a computationally inexpensive way.

Let us begin by considering columns with zero step lengths. Define

    $\theta_j = \min_{i : \bar{A}_{ij} > 0} \frac{x_{j_i}}{\bar{A}_{ij}}, \qquad j \in N.$    (2.24)

$\theta_j$ is the step length that would result from increasing the variable $x_j$. For expository
purposes we shall consider the case of promising variables with negative
reduced costs; similar logic applies for those with positive reduced costs. For a
variable with a zero step length, $\theta_j = 0$, so from (2.24) one sees that at least one
basic variable equals zero and corresponds to a positive component of $\bar{A}_{.j}$ in the
ratio test. How can one detect such columns in advance? For each $j \in P$, one could
explicitly determine $\theta_j$ and exclude any columns for which $\theta_j = 0$. Unfortunately,
this would require the representation $\bar{A}_{.P}$ of the columns $A_{.P}$ in terms of the current
basis. In other words, one must solve |P| systems of linear equations of the
form $By = A_{.j}$, a prohibitively expensive task. A much cheaper approach consists
of defining a second objective function that measures the degeneracy of the basic
variables in a way that yields valuable information on how to exclude bad columns.
Since the basic variables change during iterations when the objective improves, this
second objective function changes during the course of the algorithm. In particular,
at the start of a given iteration, define $d \in R^n$ so that $d_j = 0$ for $j \in N$, and

    $d_{j_i} = \begin{cases} 1 & \text{if } x_{j_i} = 0, \\ 0 & \text{otherwise} \end{cases}$    (2.25)


for $i = 1, \ldots, m$. Thus, d identifies the degenerate basic variables. For $j \in P$
compute the quantity $\bar{d}_j = d_B^T \bar{A}_{.j}$. The following lemma provides a simple column
screening criterion.

Lemma 3. If $\bar{d}_j > 0$, then $\theta_j = 0$.

Proof. A closer look at $\bar{d}_j$ proves the lemma. Observe from the definition of $d_B$
in (2.25) that

    $\bar{d}_j = \sum_{i : x_{j_i} = 0} \bar{A}_{ij}.$

In other words, $\bar{d}_j$ consists of the sum of components of $\bar{A}_{.j}$ that correspond to a
degenerate basic variable in the ratio test in (2.24). Therefore, if $\bar{d}_j > 0$,

    $\exists\, i^* : \bar{A}_{i^* j} > 0, \; x_{j_{i^*}} = 0.$

Note that $i^*$ is eligible for the ratio test (2.24), so

    $0 \le \theta_j \le \frac{x_{j_{i^*}}}{\bar{A}_{i^* j}} = 0,$

which establishes the desired result. ∎
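The screening test of Lemma 3 can be illustrated on a small hypothetical tableau. In the sketch below the basic values and the three columns are invented for the example, and the canonical columns are assumed to be available explicitly, which the algorithm itself avoids; the point is only to compare $\bar{d}_j$ with the true step length.

```python
def ratio_test(x_B, column):
    """Step length theta_j of (2.24); infinite if no component is positive."""
    ratios = [x / a for x, a in zip(x_B, column) if a > 0]
    return min(ratios) if ratios else float("inf")


def degeneracy_price(x_B, column):
    """d_bar_j under the indicator weights (2.25): the sum of the column
    entries in rows whose basic variable is zero."""
    return sum(a for x, a in zip(x_B, column) if x == 0)


# Hypothetical data: basic variables 1 and 3 are degenerate.
x_B = [0.0, 3.0, 0.0, 2.0]
columns = {
    "a": [1.0, 1.0, 0.0, 0.0],    # positive entry in a degenerate row
    "b": [-1.0, 2.0, 0.0, 1.0],   # only nonpositive entries in degenerate rows
    "c": [1.0, 0.0, -1.0, 1.0],   # entries in degenerate rows cancel
}
report = {j: (degeneracy_price(x_B, col), ratio_test(x_B, col))
          for j, col in columns.items()}
```

Column "a" is rejected and indeed has a zero step; column "b" passes with a positive step; column "c" passes the test even though its step is zero, which shows that the test is sufficient but not necessary.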

Thus, if $\bar{d}_j > 0$, one knows in advance that $x_j$ results in a zero step length if
it becomes basic. Furthermore, we shall see how to compute $\bar{d}_j$ efficiently. We now
have a method to exclude columns from P. Note that $\bar{d}_j \le 0$ does not necessarily
imply that $\theta_j > 0$, since positive and negative components of $\bar{A}_{.j}$ in degenerate rows
may cancel in the sum.

We now show that the computation of $\bar{d}_j$ requires essentially the same amount
of work as computing the reduced costs $\bar{c}_j$. One does not compute $\bar{d}_j$ from the
expression $d_B^T \bar{A}_{.j}$; this would involve solving for each $\bar{A}_{.j}$, $j \in P$. Instead, observe
that

    $\bar{d}_j = d_B^T \bar{A}_{.j} = (d_B^T B^{-1}) A_{.j}.$

Let $\sigma^T = d_B^T B^{-1}$. Generate $\sigma$ by solving the linear system

    $\sigma^T B = d_B^T.$    (2.26)

Solving for $\sigma$ is analogous to solving for $\pi$, the dual variables. Now, compute
$\bar{d}_j = \sigma^T A_{.j}$. Computing $\bar{d}_j$ for all $j \in P$ requires only one additional solve, as
opposed to the |P| additional solves needed to examine individual components of
each $\bar{A}_{.j}$. More generally, one can generate any linear combination of the rows of the
matrix $\bar{A}_{.P}$ with one additional solve, but examination of individual components of
each column of that matrix requires |P| extra solves. The key, of course, consists
of choosing a linear combination that yields valuable information, as in (2.25).
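The saving can be seen in a small sketch that prices all candidate columns with a single solve of (2.26) and compares the result with the column-by-column route. The basis matrix, weights, and nonbasic columns are hypothetical, and the dense Gauss-Jordan routine `solve` merely stands in for the sparse basis factorization a real code would use.

```python
from fractions import Fraction as F


def solve(M, rhs):
    """Solve M y = rhs by Gauss-Jordan elimination (M square, nonsingular)."""
    n = len(M)
    aug = [[F(v) for v in row] + [F(b)] for row, b in zip(M, rhs)]
    for k in range(n):
        p = next(r for r in range(k, n) if aug[r][k] != 0)  # pivot row
        aug[k], aug[p] = aug[p], aug[k]
        aug[k] = [v / aug[k][k] for v in aug[k]]
        for r in range(n):
            if r != k and aug[r][k] != 0:
                aug[r] = [a - aug[r][k] * b for a, b in zip(aug[r], aug[k])]
    return [row[n] for row in aug]


def transpose(M):
    return [list(col) for col in zip(*M)]


B = [[2, 1, 0], [0, 1, 1], [1, 0, 1]]       # hypothetical basis matrix
d_B = [1, 0, 1]                             # weights on the basic variables
nonbasic = {4: [1, 2, 0], 5: [0, 1, 3]}     # original columns A_j

# One solve: sigma^T B = d_B^T is the same system as B^T sigma = d_B.
sigma = solve(transpose(B), d_B)
cheap = {j: sum(s * a for s, a in zip(sigma, col))
         for j, col in nonbasic.items()}

# |P| solves: form each canonical column B^{-1} A_j, then take d_B^T of it.
costly = {j: sum(d * y for d, y in zip(d_B, solve(B, col)))
          for j, col in nonbasic.items()}
```

Both routes give the same values of $\bar{d}_j$; the first needs one system solve regardless of how many columns are priced.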

Let us now extend this notion to deal with small values of $\theta_j$. In (2.25) an
indicator function determined the value of $d_{j_i}$. In general one can set

    $d_{j_i} = f(x_{j_i}), \qquad i = 1, \ldots, m,$    (2.27)

for any specified function f. Lemma 4 proposes some beneficial choices of f.

Lemma 4. Given $\tau > 0$, suppose that $d_{j_i} = f(x_{j_i})$ where

    $f(x_{j_i}) > 0$ if $x_{j_i} < \tau$,
    $f(x_{j_i}) = 0$ if $x_{j_i} \ge \tau$.    (2.28)

Then

    $\bar{d}_j > 0 \;\Rightarrow\; \theta_j < \frac{\tau}{\min_{i : \bar{A}_{ij} > 0} \bar{A}_{ij}}.$

Proof. Lemma 3 is a special case of Lemma 4, so we utilize a similar strategy to
prove the result. Given the definition of f,

    $\bar{d}_j = \sum_{i : x_{j_i} < \tau} f(x_{j_i}) \bar{A}_{ij} > 0 \;\Rightarrow\; \exists\, i^* : \bar{A}_{i^* j} > 0$ and $x_{j_{i^*}} < \tau.$

The basic variable indexed by $i^*$ is eligible for the ratio test in (2.24), so

    $0 \le \theta_j \le \frac{x_{j_{i^*}}}{\bar{A}_{i^* j}} < \frac{\tau}{\bar{A}_{i^* j}} \le \frac{\tau}{\min_{i : \bar{A}_{ij} > 0} \bar{A}_{ij}}.$ ∎

Lemma 4 provides a way to screen out columns with small step lengths. Again,
the procedure requires only one additional solve. The particular choices of f and $\tau$
are important. One can specify a piecewise linear function to satisfy the condition
(2.28) of Lemma 4. For example, define $\bar{x} = \frac{1}{m} \sum_{i=1}^{m} x_{j_i}$ as the average of the basic
variables, and let $\alpha$ represent a positive scalar. Set

    $d_{j_i} = f(x_{j_i}) = \left[ 1 - \frac{\alpha x_{j_i}}{\bar{x}} \right]^+.$    (2.29)

Consider the plot of $d_{j_i}$ against $x_{j_i}$.

[Plot of $d_{j_i}$ against $x_{j_i}$: decreasing linearly from 1 at $x_{j_i} = 0$ to 0 at $x_{j_i} = \bar{x}/\alpha$.]

As in (2.25), we again set $d_{j_i} = 1$ if $x_{j_i} = 0$, but we now utilize the values of $\bar{x}$
and $\alpha$ to account for relatively small basic variables.

For an illustration, set $\alpha = 1$ and reconsider example (2.23) with $\epsilon = 0.1$. One
sees that $\bar{d}_3 > 0$, resulting in the exclusion of column 3 from P. The feasible
direction method then achieves a greater improvement in the objective function
than the simplex method. One can also construct examples of good columns failing
the screening test and bad columns passing it. Nonetheless, this choice of f has
proven useful in practice.
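The piecewise linear weights (2.29) and the bound of Lemma 4 can be tried on a small hypothetical instance. In the sketch below the basic values, the two columns, and $\alpha = 2$ are invented for illustration, and the threshold is taken as $\tau = \bar{x}/\alpha$, the point at which (2.29) reaches zero.

```python
def piecewise_weights(x_B, alpha):
    """Weights (2.29): d_i = max(0, 1 - alpha * x_i / mean(x_B))."""
    x_bar = sum(x_B) / len(x_B)
    return [max(0.0, 1.0 - alpha * x / x_bar) for x in x_B]


def price(weights, column):
    """Second reduced cost d_bar_j = d_B^T (canonical column)."""
    return sum(d * a for d, a in zip(weights, column))


def ratio_test(x_B, column):
    """Step length theta_j of (2.24)."""
    ratios = [x / a for x, a in zip(x_B, column) if a > 0]
    return min(ratios) if ratios else float("inf")


x_B = [4.0, 0.5, 6.0, 1.5]          # hypothetical basic values, mean 3.0
alpha = 2.0
tau = (sum(x_B) / len(x_B)) / alpha  # 1.5
d_B = piecewise_weights(x_B, alpha)  # only the value 0.5 gets a weight

small_step = [1.0, 2.0, 0.0, 0.0]    # positive entry against the small basic
large_step = [1.0, -1.0, 2.0, 0.0]   # negative entry against the small basic

d_small, theta_small = price(d_B, small_step), ratio_test(x_B, small_step)
d_large, theta_large = price(d_B, large_step), ratio_test(x_B, large_step)
bound_small = tau / min(a for a in small_step if a > 0)
```

The first column is flagged ($\bar{d}_j > 0$) and its step length of 0.25 lies below the bound of 1.5 given by Lemma 4; the second column passes and has a step length of 3.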

We  now  have  developed  an  inexpensive  way  to  evaluate  further  the  promise  of 
potential  entering  variables.  In  the  next  section  we  shall  use  these  ideas  to  propose 
a  modified  feasible  direction  algorithm.  Later  we  will  extend  these  ideas  and  apply 
them  to  the  simplex  method. 

2.7.  A  Modified  Feasible  Direction  Method 

The results of Sections 2.5 and 2.6 suggest a slightly different feasible direction
method. In particular, in light of Lemmas 3 and 4, define $P^* \subseteq P$ as follows:

    $j \in P^*$ if $\bar{c}_j < 0, \; x_j \ge 0, \; \bar{d}_j \le 0$ or
                  $\bar{c}_j > 0, \; x_j > 0, \; \bar{d}_j \ge 0.$    (2.30)

In other words, a variable remains promising only if it passes one of the screening
criteria of Section 2.6. Notice that for positive variables in P, negative values of
$\bar{d}_j$ identify excluded columns. Recall that when a variable decreases, the negative
components of $\bar{A}_{.j}$ become eligible for the ratio test (2.20). This change in sign
explains the different interpretation of $\bar{d}_j$ when a variable decreases.

Figure 2-2. A Modified Feasible Direction Method
Given: a basis B and a feasible solution x for the linear program (1.1).

1. Determine the sets P and $P^*$ by (2.13) and (2.30).

2. If $P = \emptyset$, go to 10.

3. If $P^* \ne \emptyset$, set $P = P^*$.

4. Determine the search direction q: $q_P = -\bar{c}_P$, $q_{N \setminus P} = 0$, $q_B = \bar{A}_{.P} \bar{c}_P$.

5. Determine the step length $\theta$:

       $\theta = \min_{j \in B \cup P : q_j < 0} \frac{-x_j}{q_j}.$

6. Move to a new feasible solution $x = x + \theta q$.

7. Determine the incoming and outgoing basic variables by the rules of Section
   2.4. If the procedure reveals an unbounded solution, go to 11.

8. Update the basis.

9. Go to 1.

10. The current solution is optimal.

11. Terminate the algorithm.

We now substitute P* for P in the previously described algorithms. The results
remain unchanged except for the termination criterion. Emptiness of P* need not
imply optimality of the current solution. Emptiness of P remains as the stopping
rule. Figure 2-2 summarizes the modified algorithm using a single search direction.
The changes do not affect the theory of the reduced-gradient method, so the
algorithm attains an optimal solution if one exists.
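The interplay between P and P* in steps 1 to 3 of Figure 2-2 can be sketched as follows. The set P is taken as given, since its defining condition (2.13) lies outside this section; the function names and sample values are assumptions, and the inequalities follow our reading of (2.30).

```python
def screen(P, cbar, dbar, x):
    """Subset P* of P per (2.30). The sign of the test on d_bar flips for
    variables that would decrease (positive reduced cost, positive value)."""
    keep = []
    for j in P:
        if cbar[j] < 0 and dbar[j] <= 0:
            keep.append(j)
        elif cbar[j] > 0 and x[j] > 0 and dbar[j] >= 0:
            keep.append(j)
    return keep


def working_set(P, cbar, dbar, x):
    """Steps 2-3 of Figure 2-2: optimality is declared only when P is empty;
    if every column is screened out, fall back to the unscreened set P."""
    if not P:
        return None                      # current solution is optimal
    P_star = screen(P, cbar, dbar, x)
    return P_star if P_star else list(P)


# Hypothetical data for three promising variables.
cbar = {1: -2.0, 2: -1.0, 3: 1.5}
x = {1: 0.0, 2: 0.0, 3: 2.0}
dbar = {1: 0.5, 2: -0.2, 3: -1.0}
chosen = working_set([1, 2, 3], cbar, dbar, x)

# If every column fails the screen, the full set P is used instead.
dbar_all_bad = {1: 0.5, 2: 0.3, 3: -1.0}
fallback = working_set([1, 2, 3], cbar, dbar_all_bad, x)
```

The fallback is what keeps the termination test tied to P rather than P*.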

Computational tests revealed the effectiveness of the column screening techniques
of Section 2.6. Different choices of the second objective function d were
examined. Each one substantially reduced the iteration counts. The best approach
discovered so far used the piecewise linear function in (2.29) with $\alpha = 10$. The use
of a second search direction q provided only a marginal reduction in iterations that
failed to compensate for the extra work involved. Screening out "bad" columns
emerged as the most important enhancement to the performance of this type of
algorithm.


2.8.  Computational  Results 

We now examine some computational results. The test set consists of 12 small
to moderately sized practical problems available from the Systems Optimization
Laboratory at Stanford University. MINOS 5.1, a linear and nonlinear optimization
code developed by Murtagh and Saunders (see [33] and [34]), figured prominently
in the testing. The algorithm of Figure 2-2 was implemented by modifying
the appropriate subroutines of MINOS. MINOS also provided the simplex method
used in the comparisons. Identical subroutines performed many of the common
aspects of each algorithm, including input of the problems, basis factorization, and
solution of linear systems of equations. Thus, one can attribute distinctions in performance
to different characteristics of the algorithms, instead of inconsistencies in
the implementations.

Figure 2-3. Results for Feasible Direction Method on Unscaled Problems

                      Iterations                     CPU time
Problem       Feas. dir.  Simplex  Ratio    Feas. dir.  Simplex  Ratio
--                --         --     0.91        --         --      --
ADLITTLE          --         --     0.87        --        13.43   1.58
--                --         --      --        75.34      40.50   1.86
--                --         --     1.00        --        60.59   2.19
--                --        273     0.82      226.55     134.18   1.69
--                --        274     0.76      440.90     251.21   1.76
ETAMACRO         261        335     0.78      268.83     180.22   1.49
E226             410        493     0.83      255.39     137.79   1.85
Geom. mean:                         0.91                          1.62

Figure  2-3  summarizes  the  experiments.  We  compare  both  iterations  and  CPU 
time.  Iterations  pertain  to  Phase  II  only;  Phase  I  of  the  simplex  method  generated 
the  same  feasible  vertex  for  the  feasible  direction  and  simplex  methods.  Time 
comparisons  measure  solution  of  the  whole  problem.  In  each  case  we  compute  the 
ratio  of  computational  effort  required  by  the  feasible  direction  method  to  that  of  the 
simplex  method.  Ratios  less  than  1.0  identify  superior  performance  by  the  feasible 
direction  method.  At  the  bottom  of  the  table  we  compute  the  geometric  mean  of 
the  12  ratios.  Assuming  all  problems  are  equally  important,  this  measures  relative 
performance  accumulated  over  all  test  problems.  With  respect  to  iterations,  the 
algorithm achieved moderate success. Most problems required fewer iterations than
the  simplex  method;  only  BRANDY  needed  substantially  more.  However,  each 
iteration  involves  significant  additional  work,  and  the  reduction  in  iterations  failed 
to  compensate  for  this.  CPU  times  exceeded  those  of  the  simplex  method  for  every 
problem. 
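The summary statistic used here is easy to reproduce. The sketch below uses invented ratios, not the thesis data, to show why a geometric mean is a natural way to combine performance ratios: a method that is twice as fast on one problem and twice as slow on another comes out even.

```python
import math


def geometric_mean(ratios):
    """Geometric mean of positive ratios, computed through logarithms."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical performance ratios (method A effort / method B effort).
ratios = [0.5, 2.0, 1.0]
gm = geometric_mean(ratios)                 # balanced: equals 1
am = sum(ratios) / len(ratios)              # arithmetic mean is biased upward
```

The arithmetic mean of the same ratios exceeds 1, which would wrongly suggest that the first method is slower overall.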

What  conclusions  can  one  draw  from  these  results?  The  pricing  techniques 
of  Section  2.6  improve  the  algorithm  noticeably.  Iterations  are  usually  fewer  than 
those  of  the  simplex  method.  This  marks  a  significant  improvement  over  the  results 
of  previous  testing  of  this  type  of  algorithm.  Nonetheless,  the  approach  still  fails 
to  compete  with  the  simplex  method  on  the  basis  of  computation  time,  the  most 
important  indicator.  The  flexibility  of  the  search  direction  and  pivot  rules  suggests 
that  more  successful  variants  may  exist.  However,  the  research  reported  here  reveals 
certain  inherent  difficulties  with  the  approach.  Column  screening  is  beneficial,  but 
the  procedures  defined  are  not  flawless;  columns  with  small  step  lengths  can  pass  the 
screening  test  and  inhibit  the  step  length.  The  likelihood  of  such  columns  slipping 
through  and  being  included  amongst  the  promising  variables  is  greater  when  |P| 
is large than when |P| = 1, as in the simplex method. This fact remains true
regardless  of  the  particular  choice  of  search  direction  or  pivoting  strategy.  In  order 
to  succeed,  any  member  of  this  class  of  algorithms  must  contain  features  designed 
to  evade  this  obstacle. 



CHAPTER 3: MULTIPLE-OBJECTIVE PIVOT RULES
IN THE SIMPLEX METHOD


3.1.  Preliminaries 

The  results  of  Chapter  2  suggest  the  potential  of  applying  a  two-objective 
approach  to  the  simplex  method.  The  simplex  method  tends  to  perform  poorly  on 
highly  degenerate  linear  programs,  so  the  ability  to  avoid  degenerate  pivots  may  be 
quite  useful.  Section  3.2  utilizes  the  results  of  Chapter  2  to  formulate  pivot  rules 
for  the  simplex  method.  Section  3.3  then  extends  these  ideas.  Instead  of  trying 
to  exclude  certain  variables,  we  investigate  the  use  of  a  second  objective  function 
to  make  good  selections.  Two  more  pivot  rules  arise.  Further  examination  reveals 
that  these  procedures  attempt  to  estimate  inexpensively  the  step  length  associated 
with  a  potential  entering  variable. 

Section  3.4  examines  a  parametric  variant  of  the  simplex  method  which  has 
performed  well  on  highly  degenerate  test  problems.  The  variant  resembles  the  other 
pivot  rules  of  this  chapter  because  it  also  utilizes  a  second  objective  function,  albeit 
one  that  remains  unchanged  throughout  the  algorithm.  The  similarities  motivate 
a  new  parametric  algorithm  that  incorporates  dynamic  pricing.  Section  3.5  then 
considers  the  extra  work  required  by  the  two-objective  approach.  The  chapter 
concludes  with  the  development  of  techniques  to  reduce  the  additional  computation. 

3.2.  Column  Screening  in  the  Simplex  Method 

Although motivated by feasible direction methods, Lemmas 3 and 4 apply
directly to the simplex method. Instead of choosing promising variables by (2.30),
we formulate a two-priority procedure to select the incoming variable. The following
criterion helps avoid degenerate pivots:

    $d_N = 0,$
    $d_{j_i} = I_{\{x_{j_i} = 0\}}, \qquad i = 1, \ldots, m;$

    First Priority:   $s = \arg\min_{\bar{c}_j < 0, \; \bar{d}_j \le 0} \bar{c}_j,$
    Second Priority:  $s = \arg\min_{\bar{c}_j < 0} \bar{c}_j.$    (3.1)

In other words, select a variable that passes the screening test defined by Lemma
3. If none exist, use the standard selection rule. The incoming variable always has
a negative reduced cost, and termination occurs only when all nonbasic variables
have nonnegative reduced costs. Therefore, assuming it uses a suitable technique to
resolve degeneracy, the simplex method will obtain an optimal solution in a finite
number of iterations.
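The two-priority selection can be written in a few lines. This is a sketch under the assumption that both sets of reduced costs are already available as dictionaries keyed by nonbasic index; the function name and the sample values are illustrative.

```python
def select_entering(cbar, dbar):
    """Rule (3.1): the most negative reduced cost among columns that pass
    the screening test (d_bar <= 0); if none pass, the most negative overall.
    Returns None when no reduced cost is negative (optimality)."""
    improving = [j for j in cbar if cbar[j] < 0]
    if not improving:
        return None
    screened = [j for j in improving if dbar[j] <= 0]
    pool = screened if screened else improving
    return min(pool, key=lambda j: cbar[j])


# Hypothetical reduced costs: column 1 is most attractive but is known
# from Lemma 3 to give a degenerate pivot.
cbar = {1: -5.0, 2: -3.0, 3: 2.0}
first = select_entering(cbar, {1: 1.0, 2: 0.0, 3: -1.0})
second = select_entering(cbar, {1: 1.0, 2: 2.0, 3: -1.0})
```

In the first call the screen diverts the choice to column 2; in the second call no improving column passes, so the standard rule selects column 1.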

Dantzig, Wolfe and Bland (see [6], [51], and Shamir [44]) proposed pivot rules
to handle degeneracy. The intent of these criteria was to establish finite behavior of
the simplex method. The first two rules use a very specific procedure to define the
outgoing variable, while the choice of incoming variable is arbitrary amongst those
with negative reduced cost. Bland's rule explicitly determines both the incoming
and the outgoing variables. The pivot rule (3.1) provides no guarantee of convergence
unless accompanied by a suitable degeneracy resolution technique. To see
this, refer to Hoffman's cycling example in [6]. Nonetheless, (3.1) differs from the
other rules because it deals directly with degeneracy during the selection process.
It tries to avoid problems with degeneracy instead of resolving them after their
occurrence.

Lemma 4 and the piecewise linear function of (2.29) motivate a pivot rule
of the same form as (3.1). The only difference is in the choice of d:

    $d_N = 0,$
    $d_{j_i} = \left[ 1 - \frac{\alpha x_{j_i}}{\bar{x}} \right]^+, \qquad i = 1, \ldots, m;$

    First Priority:   $s = \arg\min_{\bar{c}_j < 0, \; \bar{d}_j \le 0} \bar{c}_j,$
    Second Priority:  $s = \arg\min_{\bar{c}_j < 0} \bar{c}_j.$    (3.2)

This rule attempts to avoid small pivot steps. Once again, the simplex method
obtains an optimal solution in a finite number of iterations.


3.3.  Estimating  the  Step  Length 

The pivot rules of the preceding section use a second reduced cost $\bar{d}_j$ to avoid
poor choices of incoming variable. We now attempt to use $\bar{d}_j$ to select variables with
large step lengths. In fact, for a certain choice of d, a direct connection between $\bar{d}_j$
and the step length $\theta_j$ emerges.

To begin, define d as in the previous pivot rule (3.2). Consider the following
three-tiered pivot rule:

    First Priority:   $s = \arg\min_{\bar{c}_j < 0, \; \bar{d}_j = 0} \bar{c}_j,$
    Second Priority:  $s = \arg\max_{\bar{c}_j < 0, \; \bar{d}_j < 0} \bar{c}_j \bar{d}_j,$
    Third Priority:   $s = \arg\min_{\bar{c}_j < 0, \; \bar{d}_j > 0} \frac{\bar{c}_j}{\bar{d}_j}.$    (3.3)

What motivates such a rule? The value of $\bar{d}_j$ provides information on the components
of $\bar{A}_{.j}$ that correspond in the simplex method ratio test (2.24) to basic
variables smaller than $\bar{x}/\alpha$. Lemma 4 establishes the undesirability of nonbasic
variables with positive values of $\bar{d}_j$. Larger positive values are even worse since
they imply the presence of either more positive components of $\bar{A}_{.j}$ or a few large
positive components. Each of these occurrences suggests a small step length. If
$\bar{d}_j > 0$, use the ratio $\bar{c}_j / \bar{d}_j$ to incorporate information from both reduced costs.
This approach balances the good aspects of more negative values of $\bar{c}_j$ with the
unfavorable aspects of large positive values of $\bar{d}_j$. Similarly, negative values of $\bar{d}_j$
suggest the prevalence of negative values in the components of $\bar{A}_{.j}$ involved in the
ratio test. Only positive components of $\bar{A}_{.j}$ can bound $\theta_j$, so nonbasic variables
with more negative values of $\bar{d}_j$ are less likely to have small step lengths. The quantity
$\bar{c}_j \bar{d}_j$ estimates the improvement in the objective function if $x_j$ enters the basis.
Why should variables with $\bar{d}_j = 0$ receive top priority? The particular choice of d
in (3.2) motivates this distinction. Note that one determines $d_B$ independently of
$\bar{A}_{.j}$. Since $\bar{d}_j = d_B^T \bar{A}_{.j}$, and $d_{j_i} = 0$ if $x_{j_i} \ge \bar{x}/\alpha$, one anticipates that if $\bar{d}_j = 0$,
then, in practice, $\bar{A}_{ij} = 0$ for $i : d_{j_i} > 0$. If so, none of these smaller basic variables
bounds $\theta_j$. A large step length then becomes likely.
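The three tiers of rule (3.3) translate directly into code. The sketch below assumes the two reduced-cost vectors are given as dictionaries; the exact comparison with zero is kept for clarity, whereas a practical code would presumably use a tolerance (an assumption, not something stated in the text).

```python
def select_three_tier(cbar, dbar):
    """Rule (3.3) over columns with negative reduced cost.
    Tier 1: d_bar == 0, most negative c_bar.
    Tier 2: d_bar < 0, largest product c_bar * d_bar.
    Tier 3: d_bar > 0, smallest ratio c_bar / d_bar."""
    improving = [j for j in cbar if cbar[j] < 0]
    zero = [j for j in improving if dbar[j] == 0]
    if zero:
        return min(zero, key=lambda j: cbar[j])
    negative = [j for j in improving if dbar[j] < 0]
    if negative:
        return max(negative, key=lambda j: cbar[j] * dbar[j])
    positive = [j for j in improving if dbar[j] > 0]
    if positive:
        return min(positive, key=lambda j: cbar[j] / dbar[j])
    return None


cbar = {1: -4.0, 2: -1.0, 3: -2.0}           # hypothetical reduced costs
tier1 = select_three_tier(cbar, {1: 2.0, 2: 0.0, 3: -1.0})
tier2 = select_three_tier(cbar, {1: 2.0, 2: -1.0, 3: -3.0})
tier3 = select_three_tier(cbar, {1: 0.5, 2: 1.0, 3: 4.0})
```

Each call exercises a different tier: the column with $\bar{d}_j = 0$ wins outright, then the largest product, then the most negative ratio.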

The previous pivot rule attempts to estimate the step length associated with
$x_j$ based on the value of $\bar{d}_j$. Many other functions of the basic variables besides the
piecewise linear one of (3.2) and (3.3) may yield helpful information about $\theta_j$. How
does one determine useful functions? Theorem 1 provides insight into this question
by establishing a direct relation between $\bar{d}_j$ and $\theta_j$ for a suitable choice of f.

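Theorem 1. Assume $x_{j_i} > 0$, and set $d_N = 0$ and $d_{j_i} = 1/x_{j_i}$ for $i = 1, \ldots, m$.
For some $j \in N$, let

    $r = \arg\min_{i : \bar{A}_{ij} > 0} \frac{x_{j_i}}{\bar{A}_{ij}}.$

Then,

    $\bar{d}_j = \theta_j^{-1} + \sum_{i : \bar{A}_{ij} \ne 0, \; i \ne r} \frac{\bar{A}_{ij}}{x_{j_i}}.$

Proof. In order to prove the theorem we show that $\theta_j^{-1}$ is a term of the summation
that comprises $\bar{d}_j$. Note that r identifies the component of the basis indexing the
variable that would depart the basis if $x_j$ were chosen as entering variable. With
this in mind,

    $\theta_j = \min_{i : \bar{A}_{ij} > 0} \frac{x_{j_i}}{\bar{A}_{ij}} = \frac{x_{j_r}}{\bar{A}_{rj}}.$    (3.5)

Now, compute $\bar{d}_j$ by its definition and extract $\theta_j^{-1}$ from the resulting summation:

    $\bar{d}_j = \sum_{i=1}^{m} \frac{\bar{A}_{ij}}{x_{j_i}}$
    $\phantom{\bar{d}_j} = \sum_{i : \bar{A}_{ij} \ne 0} \frac{\bar{A}_{ij}}{x_{j_i}}$
    $\phantom{\bar{d}_j} = \frac{\bar{A}_{rj}}{x_{j_r}} + \sum_{i : \bar{A}_{ij} \ne 0, \; i \ne r} \frac{\bar{A}_{ij}}{x_{j_i}}$
    $\phantom{\bar{d}_j} = \theta_j^{-1} + \sum_{i : \bar{A}_{ij} \ne 0, \; i \ne r} \frac{\bar{A}_{ij}}{x_{j_i}}.$

The last equality follows from (3.5). ∎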
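The decomposition in Theorem 1 can be confirmed numerically in exact arithmetic. The basic values and the canonical column below are hypothetical; the function name `theorem1_terms` is ours.

```python
from fractions import Fraction as F


def theorem1_terms(x_B, column):
    """For weights d_i = 1/x_i (all x_i > 0), return the triple
    (d_bar, 1/theta, remainder) appearing in Theorem 1."""
    d_bar = sum(F(a) / F(x) for x, a in zip(x_B, column))
    positive_rows = [i for i, a in enumerate(column) if a > 0]
    r = min(positive_rows, key=lambda i: F(x_B[i]) / F(column[i]))
    inv_theta = F(column[r]) / F(x_B[r])
    remainder = sum(F(a) / F(x)
                    for i, (x, a) in enumerate(zip(x_B, column))
                    if i != r and a != 0)
    return d_bar, inv_theta, remainder


x_B = [2, 4, 1, 5]          # hypothetical positive basic variables
column = [1, -1, 3, 0]      # hypothetical canonical column
d_bar, inv_theta, remainder = theorem1_terms(x_B, column)
```

Here the ratio test is decided by the third row, so $\theta_j = 1/3$, and the remaining terms contribute only 1/4 to $\bar{d}_j = 13/4$; the reciprocal of the step length dominates the sum.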


Notice that (3.5) implies that

    $\theta_j^{-1} = \frac{\bar{A}_{rj}}{x_{j_r}} \ge \frac{\bar{A}_{ij}}{x_{j_i}}, \qquad i = 1, \ldots, m.$

Therefore, the reciprocal of $\theta_j$ contributes the largest positive element to the sum
comprising $\bar{d}_j$. One can use $\bar{d}_j$ to estimate $\theta_j$. Smaller values of $\theta_j$ imply larger
values of its reciprocal. Once again, negative values of $\bar{d}_j$ suggest relatively large
step lengths. Clearly, the term

    $\sum_{i : \bar{A}_{ij} \ne 0, \; i \ne r} \frac{\bar{A}_{ij}}{x_{j_i}}$

may drastically distort this estimate. Nonetheless, in practice the canonical columns
$\bar{A}_{.j}$ tend to be fairly sparse, reducing the number of terms in the summation comprising
$\bar{d}_j$. More importantly, $\bar{d}_j$ need only accurately estimate the size of $\theta_j$ relative
to the step lengths associated with other potential entering variables. As long as
the values of this term remain reasonably well behaved across all nonbasic variables, $\bar{d}_j$
will yield useful information about the relative sizes of the step lengths. In practice
the assumption that $x_{j_i} > 0$ is unacceptable since virtually all practical problems
exhibit degeneracy. In order to avoid this difficulty, let $\epsilon > 0$ represent a suitably
small tolerance and set $d_{j_i} = \epsilon^{-1}$ if $x_{j_i} < \epsilon$. We can now formulate a pivot rule
similar to (3.3):


    First Priority:   $s = \arg\max_{\bar{c}_j < 0, \; \bar{d}_j < 0} \bar{c}_j \bar{d}_j,$
    Second Priority:  $s = \arg\min_{\bar{c}_j < 0, \; \bar{d}_j > 0} \frac{\bar{c}_j}{\bar{d}_j}.$    (3.6)

One can view this rule as an attempt to estimate cheaply the prohibitively expensive
rule of maximizing the improvement in the objective function:

    $s = \arg\min_{j : \bar{c}_j < 0} \bar{c}_j \theta_j.$    (3.7)


This  procedure  requires  a  solution  of  a  system  of  equations  and  a  ratio  test  for 
each  nonbasic  variable  with  negative  reduced  cost.  Contrast  this  with  (3.6),  which 
requires  only  one  additional  solve.  One  could  propose  many  other  functions  to  define 
dB.  Regardless  of  the  particular  choice,  Theorem  1  reveals  the  essential  idea  behind 
it.  Explicit  computation  of  for  many  nonbasic  variables  is  hopelessly  expensive 
(in  a  sequential  computing  environment),  but  the  solution  of  a  single  system  of 
linear  equations  can  provide  an  inexpensive  estimate  of  its  value. 
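
The cost contrast between the exact rule and its estimate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses dense NumPy arrays rather than a sparse factorization, the function names are hypothetical, and it assumes the convention $\bar d_j = \sigma^T A_{\cdot j}$ with $d_{j_i} = 1/\max(x_{j_i}, \epsilon)$. The estimate needs one solve with $B^T$ for all columns; the exact step lengths need one solve and one ratio test per column.

```python
import numpy as np

def step_length_estimates(B, A_N, x_B, eps=1e-9):
    """Estimates dbar_j = sigma^T A_j, where sigma^T B = d_B^T.

    d_B[i] = 1 / x_B[i], capped at 1/eps for (near-)degenerate basics.
    Only ONE linear solve is needed for all nonbasic columns.
    """
    d_B = 1.0 / np.maximum(x_B, eps)
    sigma = np.linalg.solve(B.T, d_B)      # sigma^T B = d_B^T
    return A_N.T @ sigma

def exact_step_lengths(B, A_N, x_B):
    """Exact theta_j via a solve and a ratio test for every column."""
    Abar = np.linalg.solve(B, A_N)         # canonical columns B^{-1} A_j
    theta = np.full(A_N.shape[1], np.inf)
    for j in range(A_N.shape[1]):
        rows = Abar[:, j] > 0
        if rows.any():
            theta[j] = np.min(x_B[rows] / Abar[rows, j])
    return theta

# Tiny illustrative data (not taken from the test problems).
B = np.array([[2.0, 0.0], [0.0, 1.0]])
x_B = np.array([4.0, 3.0])
A_N = np.array([[2.0, -2.0], [1.0, 1.0]])

dbar = step_length_estimates(B, A_N, x_B)
theta = exact_step_lengths(B, A_N, x_B)
gamma = dbar - 1.0 / theta                 # the distorting term gamma_j
```

In this small example both columns have the same exact step length, yet the estimates differ because the distorting terms have opposite signs, which is the effect the text warns about.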

In [20] Kalan proposes a more elaborate version of (3.6) involving two extra pricing operations instead of one. Kalan's rule should provide more accurate information about selecting a good entering variable, but it also requires more computation time than (3.6). In [47] Todd motivates a pivot rule similar to (3.6) from the framework of an interior method for linear programming. To see the connection, note that the components of $d_B$ specified in Theorem 1 are precisely the components of the gradient of the logarithmic barrier function $\sum_i \ln x_{j_i}$.


3.4.  Parametric  Variants  of  the  Simplex  Method 

The  pivot  rule  (3.1)  helps  the  simplex  method  avoid  degenerate  pivots.  One 
expects  this  rule  to  perform  well  on  highly  degenerate  problems.  We  now  consider 
other  pivot  rules  having  nice  properties  with  respect  to  degeneracy.  The  parametric 
simplex  method  proposed  by  Gass  and  Saaty  (see  [13])  provides  a  framework.  Refer 
to  Dantzig  [7]  for  additional  details.  The  parametric  method  does  not  select  columns 
in  order  to  avoid  degenerate  pivots,  but  it  makes  progress  reducing  dual  infeasibility 
even  when  a  decrease  in  the  primal  objective  value  is  stalled  by  degeneracy.  We 
consider a special case of the algorithm of Gass and Saaty. We provide a slightly
different  proof  of  convergence  because  of  its  applicability  to  an  extension  of  the 
algorithm  that  incorporates  dynamic  pricing. 

Consider a linear program of the form (1.1) with a parametric objective function $(c^T + \theta d^T)x$. Assume a feasible basis $B_0$; let $N_0$ index the corresponding nonbasic columns. Initialize the parametric cost row $d$ as follows:

$$
d_{B_0} = 0, \qquad d_j = \|A_{\cdot j}\|_2 \ \text{ for } j \in N_0. \tag{3.8}
$$

Actually, we only require that $d_j > 0$ for $j \in N_0$ to prove convergence. However, the particular choice (3.8) ensures that the forthcoming pivot rule remains invariant under column scaling. Associated with $B_0$ are the current values of the dual variables $\pi^T = c_{B_0}^T B_0^{-1}$ and the current reduced costs $\bar c_{N_0}^T = c_{N_0}^T - \pi^T A_{N_0}$. The algorithm initializes $\theta$ at a sufficiently large value so that the parametric objective function $\bar c_{N_0}(\theta) = \bar c_{N_0} + \theta \bar d_{N_0} > 0$. Then $\theta$ decreases until it attains some value $\theta^1$ where a component of $\bar c_{N_0}(\theta)$ attains zero. In other words, the current solution is optimal for the parameterized linear program provided that $\theta \ge \theta^1$. The component that equals zero identifies the entering basic variable. A ratio test defines this selection procedure. Assume no ties occur during this test. The usual simplex method ratio test then determines the outgoing basic variable; one can break ties arbitrarily. Pivot as usual, generating a new parametric objective function $\bar c_{N_1}(\theta) = \bar c_{N_1} + \bar d_{N_1}\theta$. To calculate the parametric reduced costs $\bar d_{N_1}$, observe that $d$ is a second objective vector. Compute $\sigma^T = d_{B_1}^T B_1^{-1}$, and then set $\bar d_{N_1}^T = d_{N_1}^T - \sigma^T A_{N_1}$. Note that for the initial basis $B_0$, $\bar d_j = d_j$. We now repeat the pivot procedure. We shall see that, provided that there exists a unique choice of incoming variable, $\theta$ decreases strictly during each iteration. The parametric objective function $\bar c_N(\theta)$ remains nonnegative throughout the algorithm; the basis is optimal for the linear program (1.1) when $\theta$ attains zero.
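
The pivot procedure just described can be sketched as a short dense-matrix Python routine. This is a minimal illustration, not the implementation used for the computational tests: it assumes a standard-form problem (minimize $c^T x$ subject to $Ax = b$, $x \ge 0$), refactors the basis from scratch at every iteration, ignores ties, and uses hypothetical names throughout.

```python
import numpy as np

def parametric_simplex(A, b, c, basis, tol=1e-9, max_iter=100):
    """Dense sketch of the parametric pivot rule; returns (x, thetas)."""
    m, n = A.shape
    basis = list(basis)
    d = np.zeros(n)                                   # d_B0 = 0
    nonbasic0 = [j for j in range(n) if j not in basis]
    d[nonbasic0] = np.linalg.norm(A[:, nonbasic0], axis=0)   # d_j = ||A_j||_2
    thetas = []
    for _ in range(max_iter):
        B = A[:, basis]
        x_B = np.linalg.solve(B, b)
        pi = np.linalg.solve(B.T, c[basis])           # pi^T B = c_B^T
        sigma = np.linalg.solve(B.T, d[basis])        # sigma^T B = d_B^T
        cbar = c - A.T @ pi                           # reduced costs of c
        dbar = d - A.T @ sigma                        # reduced costs of d
        cand = [j for j in range(n) if j not in basis and cbar[j] < -tol]
        if not cand:                                  # theta has reached zero
            x = np.zeros(n)
            x[basis] = x_B
            return x, thetas
        # Parametric ratio test: entering column maximizes -cbar_j / dbar_j.
        s = max(cand, key=lambda j: -cbar[j] / dbar[j])
        thetas.append(-cbar[s] / dbar[s])
        # Usual simplex ratio test for the outgoing variable.
        col = np.linalg.solve(B, A[:, s])
        rows = [i for i in range(m) if col[i] > tol]
        if not rows:
            raise ValueError("unbounded problem")
        r = min(rows, key=lambda i: x_B[i] / col[i])
        basis[r] = s
    raise RuntimeError("iteration limit reached")

# Illustrative problem: min -x1 - 2*x2  s.t.  x1 + x2 <= 4,  x1 + 3*x2 <= 6.
A = np.array([[1.0, 1.0, 1.0, 0.0], [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([-1.0, -2.0, 0.0, 0.0])
x, thetas = parametric_simplex(A, b, c, basis=[2, 3])
```

On this example the recorded values of $\theta$ decrease strictly from one iteration to the next, as the convergence argument requires.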

Figure 3-1 summarizes the algorithm. Theorem 2 establishes convergence. Assume an optimal solution exists.

Theorem 2. Provided there exists a unique choice of incoming variable during every iteration, the parametric algorithm of Figure 3-1 determines an optimal solution in a finite number of iterations.


Figure 3-1. Summary of the Parametric Algorithm

Given: An initial feasible basis $B = B_0$; $N = N_0$ indexes the corresponding nonbasic columns.

1. Initialize parametric cost row $d$. Set $d_{B_0} = 0$ and $d_j = \|A_{\cdot j}\|_2$ for $j \in N_0$.

2. Set $\theta$ sufficiently large so that $\bar c_{N_0}(\theta) = \bar c_{N_0} + \bar d_{N_0}\theta > 0$. Note that $\bar d_j = d_j$ during the first iteration.

(Iterative Loop)

3. Decrease $\theta$ until

$$
\exists\, s:\ \bar c_s + \theta \bar d_s = 0, \qquad \bar c_j + \theta \bar d_j > 0 \ \text{ for } j \in N\setminus s. \tag{3.9}
$$

Assume such an $s$ exists. Use the following ratio test to determine $s$:

$$
s = \operatorname*{argmax}_{j:\ \bar c_j<0}\ \frac{-\bar c_j}{\bar d_j}. \tag{3.10}
$$

Let $\theta$ be the corresponding maximum. If $\theta = 0$, go to 8.

4. Given $x_s$, determine the outgoing variable $x_{j_r}$ by the usual simplex method ratio test. Ties may be broken arbitrarily. If the test reveals an unbounded solution, go to 9.

5. $A_{\cdot s}$ replaces $A_{\cdot j_r}$ as the $r$th column of the basis. Update the current feasible solution as in the standard simplex method.

6. Calculate $\bar c$ and $\bar d$ for the new basis.

7. Go to 3.

8. Optimal solution found.

9. Terminate algorithm.


Proof. We use induction to show that the sequence of parametric values $\theta^0, \theta^1, \theta^2, \ldots$ generated by the algorithm decreases strictly during each iteration. This, combined with the fact that each basis corresponds to a unique value of $\theta$, assures convergence. Consider the first iteration. Initially, $\theta = \theta^0$ is such that $\bar c_{N_0} + \theta^0 \bar d_{N_0} > 0$. We then determine

$$
\theta^1 = \max_{j:\ \bar c_j<0}\ \frac{-\bar c_j}{\bar d_j}.
$$

Since $\bar d_j = d_j = \|A_{\cdot j}\|_2 > 0$, it follows (see Figure 3-1) from (3.9) and (3.10) that $\theta^1 < \theta^0$. A strict decrease occurs during the first iteration; now consider iteration

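$k-1$. Let $\bar c^{k-1}$ and $\bar d^{k-1}$ represent the reduced costs of $c$ and $d$ at the start of iteration $k-1$. Similarly, let $\bar a_{ij}^{k-1}$ represent element $(i,j)$ of the matrix $A$ in the canonical form (1.2). $\theta^{k-1}$ represents the value of $\theta$ generated by the ratio test (3.10) during iteration $k-1$. By the induction hypothesis, $\theta^0 > \theta^1 > \cdots > \theta^{k-1}$, and

$$
\exists\, s:\ \bar c_s^{k-1} + \theta^{k-1}\bar d_s^{k-1} = 0, \qquad
\bar c_j^{k-1} + \theta^{k-1}\bar d_j^{k-1} > 0, \quad j \in N\setminus s. \tag{3.11}
$$

Note that $x_s$ replaces $x_{j_r}$ in the basis. The pivot element is then $\bar a_{rs}^{k-1} > 0$. Perform the pivot and examine the resulting reduced costs for iteration $k$:

$$
\bar c_j^{k} = \bar c_j^{k-1} - \bar a_{rj}^{k-1}\,\frac{\bar c_s^{k-1}}{\bar a_{rs}^{k-1}}, \qquad
\bar d_j^{k} = \bar d_j^{k-1} - \bar a_{rj}^{k-1}\,\frac{\bar d_s^{k-1}}{\bar a_{rs}^{k-1}}.
$$

Multiply $\bar d_j^{k}$ by $\theta^{k-1}$ and add to $\bar c_j^{k}$:

$$
\begin{aligned}
\bar c_j^{k} + \theta^{k-1}\bar d_j^{k}
 &= \bar c_j^{k-1} + \theta^{k-1}\bar d_j^{k-1}
    - \frac{\bar a_{rj}^{k-1}}{\bar a_{rs}^{k-1}}
      \underbrace{\left(\bar c_s^{k-1} + \theta^{k-1}\bar d_s^{k-1}\right)}_{=\,0 \text{ by (3.11)}} \\
 &= \bar c_j^{k-1} + \theta^{k-1}\bar d_j^{k-1} \ \ge\ 0.
\end{aligned}
$$

The last inequality follows from (3.11). Since $\bar c_j^{k} + \theta^{k-1}\bar d_j^{k} \ge 0$ and $\theta^{k-1} > 0$, it follows that

$$
\bar c_j^{k} < 0 \ \Longrightarrow\ \bar d_j^{k} > 0. \tag{3.12}
$$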

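Unless the current basis is optimal, (3.12) guarantees at least one potential pivot column satisfying (3.11). Also, observe that since $\bar c_j^{k} + \theta^{k-1}\bar d_j^{k} \ge 0$, a nonbasic column $s$ such that $\bar c_s^{k} > 0$ and $\bar d_s^{k} < 0$ cannot satisfy (3.11) for $\theta < \theta^{k-1}$. Hence, we only need consider $j \in N : \bar c_j^{k} < 0$ as candidates to index the entering variable. This validates the ratio test (3.10). We can now show that $\theta^{k} < \theta^{k-1}$. Since $\bar c_j^{k} + \theta^{k-1}\bar d_j^{k} \ge 0$, (3.12) implies the existence of $\theta^{k}$ and $s$ such that $\theta^{k} < \theta^{k-1}$, $\bar c_s^{k} + \theta^{k}\bar d_s^{k} = 0$, and $\bar c_j^{k} + \theta^{k}\bar d_j^{k} \ge 0$. To determine $\theta^{k}$ and $s$, we choose $\theta^{k}$ as the smallest possible value that satisfies the condition

$$
\begin{aligned}
 & \bar c_j^{k} + \theta^{k}\bar d_j^{k} \ge 0 \quad \text{for } j:\ \bar c_j^{k} < 0,\ \bar d_j^{k} > 0 \\
\Longleftrightarrow\ & \bar c_j^{k} \ge -\theta^{k}\bar d_j^{k} \quad \text{for } j:\ \bar c_j^{k} < 0 \\
\Longleftrightarrow\ & \theta^{k} \ge -\frac{\bar c_j^{k}}{\bar d_j^{k}} \quad \text{for } j:\ \bar c_j^{k} < 0 \\
\Longleftrightarrow\ & \theta^{k} = \max_{j:\ \bar c_j^{k}<0} \frac{-\bar c_j^{k}}{\bar d_j^{k}}
 \quad\text{and}\quad s = \operatorname*{argmax}_{j:\ \bar c_j^{k}<0} \frac{-\bar c_j^{k}}{\bar d_j^{k}}.
\end{aligned}
$$

Thus, the ratio test (3.10) determines $s$ and $\theta^{k} < \theta^{k-1} < \cdots < \theta^{1} < \theta^{0}$. Remember that we have assumed no ties occur in this test. A unique value of $\theta$ corresponds to each basis since the ratio test involves $\bar c$ and $\bar d$. Therefore, a basis cannot repeat itself during the algorithm. This completes the inductive proof. ∎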

Theorem 2 assumes that no ties occur during the ratio test (3.10). A random perturbation of the initial values of $d_{N_0}$ validates this assumption with probability one.

The values of $\theta$ generated during the parametric algorithm provide a measure
of  the  level  of  dual  infeasibility.  Even  when  stalling  (a  long  sequence  of  degenerate 
pivots)  occurs  in  the  primal,  the  algorithm  progresses  in  the  dual.  This  suggests  that 
the  method  will  perform  well  in  the  presence  of  degeneracy.  The  screening  criterion 
(3.1)  works  in  the  primal;  it  tries  to  avoid  degenerate  pivots  and,  hence,  stalling 
by  using  a  second  objective  function  d.  The  parametric  algorithm  also  utilizes  a 
second  objective  function,  albeit  a  constant  one.  From  the  perspective  of  the  primal, 
however,  no  features  of  this  second  objective  appear  to  help  it  avoid  degenerate 
pivots.  As  we  shall  see,  (3.1)  reduces  the  percentage  of  degenerate  pivots,  while 
the  parametric  algorithm  does  not.  Nonetheless,  computational  tests  in  Chapter  4 
reveal  the  effectiveness  of  both  methods  on  highly  degenerate  problems. 

The inability of the parametric algorithm to avoid degenerate pivots suggests the potential of a variant that can avoid degenerate pivots while still decreasing $\theta$ during each iteration. Unfortunately, the parametric algorithm lacks the freedom to choose the incoming variable. $\theta$ need not decrease unless the ratio test (3.10) determines the incoming variable. However, provided that one initializes $d_j > 0$ for $j \in N_0$, $\theta$ decreases monotonically under the parametric pivot rule. Since the algorithm generates a sequence of feasible bases, one can reinitialize the parametric objective at any iteration. We shall use this fact to formulate a modified parametric algorithm that screens for degenerate pivots without sacrificing the monotonic decrease of $\theta$.

Let us begin by defining some additional notation. Let $d^1$ and $d^2$ represent the parametric and dynamic objectives, respectively. Let $\bar d^1$ and $\bar d^2$ be the corresponding reduced costs. Consider any iteration. As before, $s$ indexes the entering variable chosen by the parametric algorithm:

$$
s = \operatorname*{argmax}_{j:\ \bar c_j<0}\ \frac{-\bar c_j}{\bar d_j^{1}}. \tag{3.13}
$$

Again, assume $s$ is unique. If $\bar d_s^{2} > 0$, Lemma 3 implies the resulting pivot will be degenerate. Let $q$ index the variable that maximizes the ratio in (3.13) while also passing the screening criterion of (3.1):

$$
q = \operatorname*{argmax}_{j:\ \bar c_j<0,\ \bar d_j^{2}<0}\ \frac{-\bar c_j}{\bar d_j^{1}}. \tag{3.14}
$$

Assume that during all previous iterations $q = s$. In other words, the entering variable was always a valid choice under the parametric algorithm; hence $\theta$ decreased during each iteration. Suppose at the current iteration $q \neq s$. Then, since $q$ does not maximize the ratio in (3.13),

$$
\frac{-\bar c_s}{\bar d_s^{1}} > \frac{-\bar c_q}{\bar d_q^{1}}. \tag{3.15}
$$

We wish to reinitialize the parametric objective vector so that $q$ indexes a legitimate entering column with respect to the parametric algorithm. In order to do so, determine $\delta > 0$ such that

$$
\frac{-\bar c_q}{\bar d_q^{1} - \delta} > \frac{-\bar c_s}{\bar d_s^{1}}.
$$

The following lemmas motivate the proper selection of $\delta$.
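Lemma 5.

$$
\frac{\bar c_s \bar d_q^{1} - \bar c_q \bar d_s^{1}}{\bar c_s} < \bar d_q^{1}.
$$

Proof. Suppose the contrary:

$$
\frac{\bar c_s \bar d_q^{1} - \bar c_q \bar d_s^{1}}{\bar c_s} \ge \bar d_q^{1}.
$$

Remember that all previous pivots were valid under the parametric algorithm. All properties of the algorithm remain true, so by (3.12) the columns $s$ and $q$, which both have negative reduced costs, satisfy $\bar d_s^{1} > 0$ and $\bar d_q^{1} > 0$. Subtracting $\bar d_q^{1}$ from both sides of the last inequality,

$$
\begin{aligned}
 & \frac{\bar c_s \bar d_q^{1} - \bar c_q \bar d_s^{1}}{\bar c_s} - \bar d_q^{1} \ge 0 \\
\Longrightarrow\ & -\frac{\bar c_q \bar d_s^{1}}{\bar c_s} \ge 0 \\
\Longrightarrow\ & -\frac{\bar d_s^{1}}{\bar c_s} \le 0 \quad (\text{since } \bar c_q < 0).
\end{aligned}
$$

But $\bar d_s^{1} > 0$ and $\bar c_s < 0$, establishing a contradiction.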


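Lemma 6.

$$
\frac{\bar c_s \bar d_q^{1} - \bar c_q \bar d_s^{1}}{\bar c_s} > 0.
$$

Proof. Since $s \neq q$, use (3.15) along with the fact that $\bar d_s^{1}$ and $\bar d_q^{1}$ are positive:

$$
\begin{aligned}
 & \frac{-\bar c_s}{\bar d_s^{1}} > \frac{-\bar c_q}{\bar d_q^{1}} \\
\Longrightarrow\ & \bar c_s \bar d_q^{1} < \bar c_q \bar d_s^{1} \\
\Longrightarrow\ & \bar c_s \bar d_q^{1} - \bar c_q \bar d_s^{1} < 0 \quad (\text{recall that } \bar c_s < 0) \\
\Longrightarrow\ & \frac{\bar c_s \bar d_q^{1} - \bar c_q \bar d_s^{1}}{\bar c_s} > 0.
\end{aligned}
$$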


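Lemma 7.

$$
\exists\, \delta > \frac{\bar c_s \bar d_q^{1} - \bar c_q \bar d_s^{1}}{\bar c_s}
\ \text{ such that }\
\frac{-\bar c_s}{\bar d_s^{1}} < \frac{-\bar c_q}{\bar d_q^{1} - \delta}. \tag{3.16}
$$

Proof. Lemma 5 and property (3.12) provide the proof. By Lemma 5,

$$
\frac{\bar c_s \bar d_q^{1} - \bar c_q \bar d_s^{1}}{\bar c_s} < \bar d_q^{1}.
$$

Therefore,

$$
\exists\, \delta > \frac{\bar c_s \bar d_q^{1} - \bar c_q \bar d_s^{1}}{\bar c_s}
\ \text{ such that }\ \bar d_q^{1} > \delta. \tag{3.17}
$$

Note that $\bar d_q^{1} - \delta > 0$. Multiplying (3.17) by $\bar c_s < 0$ reveals that

$$
\begin{aligned}
 & \bar c_s \delta < \bar c_s \bar d_q^{1} - \bar c_q \bar d_s^{1} \\
\Longrightarrow\ & -\bar c_s\left(\bar d_q^{1} - \delta\right) < -\bar c_q \bar d_s^{1} \quad (\text{recall that } \bar d_s^{1} > 0) \\
\Longrightarrow\ & \frac{-\bar c_s}{\bar d_s^{1}} < \frac{-\bar c_q}{\bar d_q^{1} - \delta}.
\end{aligned}
$$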


Given these lemmas, Theorem 3 shows how to reinitialize the parametric objective vector so that $x_q$ enters the basis instead of $x_s$.
Theorem 3. Assume the standard parametric ratio test (3.13) determines a unique entering variable during the first $k-1$ iterations of the parametric algorithm. Suppose that $q \neq s$ at iteration $k$, and the parametric objective vector $d^1$ is replaced by the following vector $d^3$:

$$
\begin{aligned}
d_B^{3} &= 0, \\
d_q^{3} &= \bar d_q^{1} - \delta, \\
d_j^{3} &= \bar d_j^{1} \quad \text{for } j \in N:\ \bar c_j < 0,\ j \neq q, \\
d_j^{3} &= d_j^{1} \quad \text{for } j \in N:\ \bar c_j \ge 0.
\end{aligned}
\tag{3.18}
$$

Then $\exists\, \delta$ such that $q$ indexes a valid choice of pivot column for the parametric algorithm.

Proof. To prove the theorem, we must show:

(i) that $d^3$ is a legitimate choice of initial vector for the standard parametric algorithm.

(ii) that the algorithm will select $x_q$ to enter the basis, i.e.

$$
q = \operatorname*{argmax}_{j:\ \bar c_j<0}\ \frac{-\bar c_j}{\bar d_j^{3}}.
$$

(iii) that the value of the parameter $\theta^{k} < \theta^{k-1}$.

Let $B$ and $N$ index the basic and nonbasic variables of iteration $k$. Consider $d^3$. By Lemmas 6 and 7, $\exists\, \delta > 0$ satisfying (3.16) such that $d_q^{3} = \bar d_q^{1} - \delta > 0$. By property (3.12) of the parametric algorithm, $\bar d_j^{1} > 0$ for $j : \bar c_j < 0$. Furthermore, $d_j^{1} = \|A_{\cdot j}\|_2 > 0$. Hence $d_j^{3} > 0$ for $j \in N$, so $d^3$ is a legitimate initial vector for the parametric algorithm. This establishes (i).

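Now consider the new parametric reduced costs $\bar d^{3}$. Since $d_B^{3} = 0$, $\sigma^T = (d_B^{3})^T B^{-1} = 0$, so $\bar d_j^{3} = d_j^{3} - \sigma^T A_{\cdot j} = d_j^{3}$. Using Lemma 7 and (3.18),

$$
\frac{-\bar c_q}{\bar d_q^{3}} = \frac{-\bar c_q}{\bar d_q^{1} - \delta} > \frac{-\bar c_s}{\bar d_s^{1}}.
$$

Also, since $s$ maximized the ratio for the standard parametric pivot rule (3.13),

$$
\frac{-\bar c_s}{\bar d_s^{1}} \ \ge\ \frac{-\bar c_j}{\bar d_j^{1}} = \frac{-\bar c_j}{\bar d_j^{3}}
\quad \text{for } j:\ \bar c_j < 0,\ j \neq q.
$$

Combining inequalities,

$$
\frac{-\bar c_q}{\bar d_q^{3}} > \frac{-\bar c_j}{\bar d_j^{3}}
\quad \text{for } j:\ \bar c_j < 0,\ j \neq q.
$$

In other words,

$$
q = \operatorname*{argmax}_{j:\ \bar c_j<0}\ \frac{-\bar c_j}{\bar d_j^{3}}.
$$

This establishes (ii).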

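All that remains is to show that the parameter decreases when $x_q$ enters the basis. Let $\theta_s^{k}$ represent the value of the parameter at iteration $k$ determined by the normal parametric pivot rule (3.13). In other words,

$$
\theta_s^{k} = \frac{-\bar c_s}{\bar d_s^{1}}.
$$

Similarly, define $\theta_q^{k}$ as the parameter associated with entering $x_q$ into the basis after reinitializing the parametric objective row:

$$
\theta_q^{k} = \frac{-\bar c_q}{\bar d_q^{1} - \delta}.
$$

Using Lemma 7, we see that

$$
\delta = \frac{\bar c_s \bar d_q^{1} - \bar c_q \bar d_s^{1}}{\bar c_s} + \epsilon,
$$

where $\epsilon \in (0, \gamma)$ for some $\gamma > 0$. Substituting the last equation into the previous one,

$$
\begin{aligned}
\theta_q^{k}
 &= \frac{-\bar c_q}{\bar d_q^{1} - \left(\left(\bar c_s \bar d_q^{1} - \bar c_q \bar d_s^{1}\right)/\bar c_s\right) - \epsilon} \\
 &= \frac{-\bar c_q}{\left(\bar c_q \bar d_s^{1} - \epsilon \bar c_s\right)/\bar c_s}
  = \frac{-\bar c_s \bar c_q}{\bar c_q \bar d_s^{1} - \epsilon \bar c_s} \\
 &= \frac{-\bar c_s}{\bar d_s^{1} - \epsilon \bar c_s/\bar c_q} \ >\ \theta_s^{k}.
\end{aligned}
$$

The last inequality holds since $\theta_s^{k} = -\bar c_s/\bar d_s^{1}$ and $\bar c_s/\bar c_q > 0$. Although $\theta_q^{k} > \theta_s^{k} = -\bar c_s/\bar d_s^{1}$, note that $(\theta_q^{k} - \theta_s^{k})$ can be made arbitrarily small with a suitable choice of $\epsilon$. Recall that $\theta_s^{k} < \theta^{k-1}$ because all previous iterations have obeyed the normal parametric pivot rule. It follows that $\exists\, \epsilon$ sufficiently small so that $\theta_q^{k} < \theta^{k-1}$. Thus, the parameter decreases by a positive amount, proving (iii). We have established that, with a suitable choice of $\epsilon$, $q$ is the proper column selection for the parametric algorithm with parametric objective function $d^3$. ∎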
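
The choice of $\delta$ can be illustrated numerically. The Python sketch below is an assumption-laden illustration rather than the thesis procedure: the function name is hypothetical, the closed form used for the bound $\gamma$ is obtained here by solving the requirement $-\bar c_s/(\bar d_s^{1} - \epsilon \bar c_s/\bar c_q) < \theta^{k-1}$ for $\epsilon$, and taking $\epsilon = \gamma/2$ is an arbitrary choice inside the admissible interval.

```python
def reinitialization_shift(cbar_s, dbar_s, cbar_q, dbar_q, theta_prev):
    """Pick delta so that column q wins the parametric ratio test
    while the parameter still drops below theta_prev.

    Assumes cbar_s < 0, cbar_q < 0, dbar_s > 0, dbar_q > 0 and
    -cbar_s/dbar_s > -cbar_q/dbar_q, with -cbar_s/dbar_s < theta_prev.
    """
    # Lower bound on delta from Lemma 7.
    lower = (cbar_s * dbar_q - cbar_q * dbar_s) / cbar_s
    theta_s = -cbar_s / dbar_s
    # Largest admissible epsilon (derived for this sketch, see lead-in).
    gamma = (cbar_q * dbar_s / cbar_s) * (1.0 - theta_s / theta_prev)
    eps = 0.5 * gamma                      # arbitrary point in (0, gamma)
    delta = lower + eps
    theta_q = -cbar_q / (dbar_q - delta)   # parameter after reinitializing
    return delta, theta_q

# Illustrative numbers only: theta_s = 2, ratio for q is 1, previous theta is 3.
delta, theta_q = reinitialization_shift(
    cbar_s=-2.0, dbar_s=1.0, cbar_q=-1.0, dbar_q=1.0, theta_prev=3.0
)
```

With these numbers the new parameter lies strictly between the value the standard rule would have produced and the previous parameter, which is exactly the property that part (iii) of the proof establishes.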

Given the results of Theorem 2, Theorem 3 shows that the modified parametric algorithm summarized in Figure 3-2 terminates in a finite number of iterations if accompanied by a suitable degeneracy resolution technique. The algorithm initializes the parametric objective vector in the normal way and proceeds with the standard parametric method until a pivot selection $s$ fails the screening criterion (3.1). At this point a feasible basis exists, so we reinitialize the parametric vector so that $x_q$ enters the basis instead of $x_s$. Lemmas 5-7 and Theorem 3 describe the reinitialization procedure and show that the parameter still decreases.


Figure 3-2. Summary of a Modified Parametric Algorithm

Given: An initial feasible basis $B = B_0$; $N = N_0$ indexes the corresponding nonbasic columns.

1. Initialize parametric cost row $d^1$. Set $d^1_{B_0} = 0$ and $d^1_j = \|A_{\cdot j}\|_2$ for $j \in N_0$.

2. Set $\theta$ sufficiently large so that $\bar c_{N_0}(\theta) = \bar c_{N_0} + \bar d^{1}_{N_0}\theta > 0$.

(Iterative Loop)

3. Determine $s$ and $q$ by the ratio tests (3.13) and (3.14). If $\theta = 0$, go to 12.

4. If $s = q$, go to 9.

5. Since $s \neq q$, reinitialize $d^1$. Determine $\gamma > 0$ such that

$$
\frac{-\bar c_s}{\bar d_s^{1} - \epsilon \bar c_s/\bar c_q} < \theta^{k-1} \quad \text{for } \epsilon \in (0, \gamma).
$$

6. Set

$$
\delta = \frac{\bar c_s \bar d_q^{1} - \bar c_q \bar d_s^{1}}{\bar c_s} + \epsilon
\quad \text{so that} \quad
\bar d_q^{1} > \delta > \frac{\bar c_s \bar d_q^{1} - \bar c_q \bar d_s^{1}}{\bar c_s}.
$$

7. Reinitialize $d^1$ as follows:

$$
\begin{aligned}
d_B^{1} &\leftarrow 0, \\
d_q^{1} &\leftarrow \bar d_q^{1} - \delta, \\
d_j^{1} &\leftarrow \bar d_j^{1} \quad \text{for } j:\ \bar c_j < 0,\ j \neq q, \\
d_j^{1} &\leftarrow d_j^{1} \quad \text{for } j:\ \bar c_j \ge 0.
\end{aligned}
$$

8. Variable $x_q$ enters the basis. Go to 10.

9. Variable $x_s$ enters the basis.

10. Determine the pivot row by the standard simplex method ratio test. If the test reveals an unbounded solution, go to 13.

11. Pivot and update the basis. Update the current feasible solution. Compute $\bar c$, $\bar d^{1}$, and $\bar d^{2}$ for the new basis. Go to 3.

12. Optimal solution found.

13. Terminate algorithm.

Note that the modified algorithm requires a degeneracy resolution technique to guarantee convergence while the original one does not. This distinction arises because the modified algorithm alters the parametric objective function, whereas the parametric objective of the original one remains unchanged. It therefore becomes conceivable that a basis could repeat itself where the parametric objective has doubled and the parameter has halved (see [48]). In this instance the parameter would decrease at each iteration yet never attain zero. If accompanied by a degeneracy resolver, the modified algorithm will terminate in a finite number of iterations since it always selects variables with negative reduced costs to enter the basis.

Each iteration of the modified parametric algorithm requires computation of two extra vectors of reduced costs. Also, the reinitialization procedure (3.18) involves some more work. The total additional computation exceeds that of any of the previously described pivot rules. Nonetheless, the approach incorporates the beneficial characteristics of the parametric method and pivot rule (3.1). It progresses in the dual during stalls in the primal, but it also attempts to avoid such stalls.

3.5.  Reduction  of  the  Additional  Computation 

All of the previously described pivot rules compute reduced costs on a second objective vector $d$. In the context of the revised simplex method, one must determine $d_B$, solve an extra system of equations

$$
\sigma^T B = d_B^T, \tag{3.19}
$$

and then calculate reduced costs $\bar d_j$ for $j : \bar c_j < 0$. Although not prohibitively expensive, these steps comprise a significant fraction of the time required for a simplex method iteration. In this section we explore techniques to reduce the additional work.

An opportunity to save time arises during the solution of the system of equations (3.19). Notice the similarity between solving for $\sigma$ and solving for the dual variables $\pi$:

$$
\pi^T B = c_B^T. \tag{3.20}
$$

One should solve (3.19) and (3.20) simultaneously. One could call a subroutine twice:

    CALL SOLVE(π, c_B, ...)

    <additional code>

    CALL SOLVE(σ, d_B, ...).

Assuming an LU factorization represents $B$, each call involves solving the linear system $w^T B = z^T$, which in turn requires solving the linear systems

$$
y^T U = z^T, \qquad w^T L = y^T.
$$

Thus each subroutine call must access the array containing the nonzeros of the lower triangular matrix $L$ and the upper triangular matrix $U$. Instead, suppose one modifies the subroutine so it computes $\pi$ and $\sigma$ during the same call:

    CALL SOLVE(π, σ, c_B, d_B, ...).

This approach requires only one access of the array containing the factorization of the basis $B$ and may therefore reduce the computation time involved.
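
A minimal Python sketch of the same idea follows. It is only an analogy under stated assumptions: NumPy's dense `numpy.linalg.solve` stands in for the sparse LU routines of a production simplex code, and the function name is hypothetical. Passing both right-hand sides in one call lets the solver factor $B^T$ once and reuse the factors for $\pi$ and $\sigma$.

```python
import numpy as np

def solve_pi_and_sigma(B, c_B, d_B):
    """Solve pi^T B = c_B^T and sigma^T B = d_B^T in a single call."""
    rhs = np.column_stack([c_B, d_B])      # two right-hand sides side by side
    sol = np.linalg.solve(B.T, rhs)        # one factorization of B^T
    return sol[:, 0], sol[:, 1]

# Illustrative data only.
B = np.array([[1.0, 1.0], [1.0, 3.0]])
c_B = np.array([-1.0, -2.0])
d_B = np.array([np.sqrt(2.0), np.sqrt(10.0)])

pi, sigma = solve_pi_and_sigma(B, c_B, d_B)
```

The returned vectors agree with two separate solves; only the number of passes over the factorization changes.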

In certain situations, the extra solve (3.19) becomes unnecessary. In particular we will demonstrate how to update $\sigma$ for the pivot rules (3.1), (3.2), (3.3), and (3.6) provided that a degenerate pivot occurred during the previous iteration. We shall also see that one can always skip the extra solve for a parametric objective vector. Again, $e_r$ denotes a unit column vector with a one in the $r$th component.

Theorem 4. Let $B_k$, $\pi_k$, and $\sigma_k$ represent the basis, dual variables, and second objective multipliers during iteration $k$. Let $\bar a_{rs}$, $\bar c_j$, and $\bar d_j$ be the pivot element, reduced costs and second objective reduced costs. Suppose that the second objective vector $d$ changes by only one component after iteration $k$:

$$
d_{B_{k+1}} = d_{B_k} + \rho e_r, \qquad \rho \in R^1. \tag{3.21}
$$

Then

$$
\sigma_{k+1}^T = \sigma_k^T + \frac{\rho - \sigma_k^T\left(A_{\cdot s} - A_{\cdot j_r}\right)}{\bar c_s}\left(\pi_{k+1}^T - \pi_k^T\right). \tag{3.22}
$$

Proof. To prove the theorem we exploit the similarity between the linear systems (3.19) and (3.20) when (3.21) holds. Note that $\bar c_s < 0$ and $\bar a_{rs} > 0$, so all of the quotients formed in the proof remain well defined.

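Let $s$ and $j_r$ index the incoming and outgoing basic variables, respectively, during iteration $k$. Then,

$$
B_{k+1} = B_k + \left(A_{\cdot s} - A_{\cdot j_r}\right) e_r^T. \tag{3.23}
$$

Let $u = (A_{\cdot s} - A_{\cdot j_r})$, and suppose $v$ solves the linear system

$$
v^T B_k = e_r^T. \tag{3.24}
$$

In other words, $v^T$ contains the $r$th row of $B_k^{-1}$. Observe from (3.23) and (3.24) that

$$
B_{k+1} = (I + uv^T) B_k. \tag{3.25}
$$

Since $\sigma_{k+1}^T B_{k+1} = d_{B_{k+1}}^T$ and $d_{B_{k+1}} = d_{B_k} + \rho e_r$, substituting for $B_{k+1}$ as in (3.25) implies that

$$
\begin{aligned}
\sigma_{k+1}^T (I + uv^T) B_k &= d_{B_k}^T + \rho e_r^T \\
 &= \sigma_k^T B_k + \rho v^T B_k \quad \text{by (3.19) and (3.24)} \\
\Longrightarrow\quad \sigma_{k+1}^T (I + uv^T) &= \sigma_k^T + \rho v^T.
\end{aligned}
\tag{3.26}
$$

Now, proceed similarly to derive an analogous expression for $\pi_{k+1}$. First of all, note that

$$
c_{B_{k+1}} = c_{B_k} + (c_s - c_{j_r}) e_r.
$$

Since $\pi_{k+1}^T B_{k+1} = c_{B_{k+1}}^T$, it follows from (3.25) that

$$
\begin{aligned}
\pi_{k+1}^T (I + uv^T) B_k &= c_{B_k}^T + (c_s - c_{j_r}) e_r^T \\
 &= \pi_k^T B_k + (c_s - c_{j_r}) v^T B_k \\
\Longrightarrow\quad \pi_{k+1}^T (I + uv^T) &= \pi_k^T + (c_s - c_{j_r}) v^T.
\end{aligned}
\tag{3.27}
$$

Rearranging (3.26) and (3.27),

$$
\sigma_{k+1}^T - \sigma_k^T = \left(\rho - \sigma_{k+1}^T u\right) v^T, \tag{3.28}
$$

$$
\pi_{k+1}^T - \pi_k^T = \left((c_s - c_{j_r}) - \pi_{k+1}^T u\right) v^T. \tag{3.29}
$$

Substituting (3.29) into (3.28),

$$
\sigma_{k+1}^T - \sigma_k^T = \frac{\rho - \sigma_{k+1}^T u}{(c_s - c_{j_r}) - \pi_{k+1}^T u}\left(\pi_{k+1}^T - \pi_k^T\right). \tag{3.30}
$$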

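We shall later verify that the denominator in this expression cannot equal zero. We must now derive expressions for $\sigma_{k+1}^T u$ and $\pi_{k+1}^T u$. Begin by multiplying both sides of (3.28) by $u$:

$$
\sigma_{k+1}^T u - \sigma_k^T u = \left(\rho - \sigma_{k+1}^T u\right) v^T u.
$$

Rearranging,

$$
\begin{aligned}
 & \sigma_{k+1}^T \left(u + u v^T u\right) = \rho v^T u + \sigma_k^T u \\
\Longrightarrow\ & \sigma_{k+1}^T u \left(1 + v^T u\right) = \rho v^T u + \sigma_k^T u \\
\Longrightarrow\ & \sigma_{k+1}^T u = \frac{\rho v^T u + \sigma_k^T u}{1 + v^T u}.
\end{aligned}
\tag{3.31}
$$

Note that $v^T u = e_r^T B_k^{-1}\left(A_{\cdot s} - A_{\cdot j_r}\right) = \bar a_{rs} - 1$, so the denominator in (3.31) is nonzero. Proceed similarly to derive $\pi_{k+1}^T u$:

$$
\begin{aligned}
 & \pi_{k+1}^T u - \pi_k^T u = \left((c_s - c_{j_r}) - \pi_{k+1}^T u\right) v^T u \\
\Longrightarrow\ & \pi_{k+1}^T u \left(1 + v^T u\right) = (c_s - c_{j_r}) v^T u + \pi_k^T u \\
\Longrightarrow\ & \pi_{k+1}^T u = \frac{(c_s - c_{j_r}) v^T u + \pi_k^T u}{1 + v^T u}.
\end{aligned}
\tag{3.32}
$$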

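Consider the denominator in (3.30). Remember that $s$ and $j_r$ index nonbasic and basic variables respectively at the start of iteration $k$. Substituting the value of $\pi_{k+1}^T u$ from (3.32) and regrouping under a common denominator,

$$
\begin{aligned}
(c_s - c_{j_r}) - \pi_{k+1}^T u
 &= \frac{(c_s - c_{j_r})(1 + v^T u) - (c_s - c_{j_r}) v^T u - \pi_k^T u}{1 + v^T u} \\
 &= \frac{(c_s - c_{j_r}) - \pi_k^T\left(A_{\cdot s} - A_{\cdot j_r}\right)}{1 + v^T u} \\
 &= \frac{\left(c_s - \pi_k^T A_{\cdot s}\right) - \left(c_{j_r} - \pi_k^T A_{\cdot j_r}\right)}{1 + v^T u} \\
 &= \frac{\bar c_s}{1 + v^T u}.
\end{aligned}
\tag{3.33}
$$

Thus, the denominator is indeed nonzero. Apply the same logic to the numerator of (3.30) by using (3.31):

$$
\begin{aligned}
\rho - \sigma_{k+1}^T u
 &= \frac{\rho (1 + v^T u) - \left(\rho v^T u + \sigma_k^T u\right)}{1 + v^T u} \\
 &= \frac{\rho - \sigma_k^T u}{1 + v^T u}.
\end{aligned}
\tag{3.34}
$$

Substituting (3.33) and (3.34) into (3.30) and rearranging,

$$
\sigma_{k+1}^T = \sigma_k^T + \frac{\rho - \sigma_k^T\left(A_{\cdot s} - A_{\cdot j_r}\right)}{\bar c_s}\left(\pi_{k+1}^T - \pi_k^T\right). \qquad ∎
$$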
The implications of Theorem 4 depend on the particular choice of second objective. For any of the pivot rules utilizing dynamic pricing, the result reduces computation time whenever a degenerate pivot occurs. In this case the values of the basic variables do not change between successive iterations; the simplex method merely exchanges two variables that equal zero. Therefore, $d_{B_{k+1}} = d_{B_k}$, which implies that $\rho = 0$. Note that $\sigma_k^T A_{\cdot s} = \bar d_s$, and, since $\sigma^T B = d_B^T$, $\sigma_k^T A_{\cdot j_r} = d_{j_r}$. The result of Theorem 4 simplifies to

$$
\sigma_{k+1}^T = \sigma_k^T - \frac{\bar d_s - d_{j_r}}{\bar c_s}\left(\pi_{k+1}^T - \pi_k^T\right).
$$

In order to perform the computational tests, it was necessary to handle bounded variables. In this case a degenerate pivot may occur when the value of the incoming basic variable differs from that of the outgoing one. Hence, $d_{B_{k+1}} = d_{B_k} + \rho e_r$, where $\rho \neq 0$; Theorem 4 still holds. For details about the different values of $\rho$ generated by the various types of degenerate pivots, refer to the appendix.

Let us now consider parametric algorithms. In this case

$$
\begin{aligned}
\sigma_k^T u &= \sigma_k^T A_{\cdot s} - \sigma_k^T A_{\cdot j_r} \\
 &= \underbrace{\left(-d_s + \sigma_k^T A_{\cdot s}\right)}_{-\bar d_s}
  + \underbrace{\left(d_{j_r} - \sigma_k^T A_{\cdot j_r}\right)}_{0}
  + d_s - d_{j_r} \\
 &= -\bar d_s + (d_s - d_{j_r}).
\end{aligned}
$$

Also, regardless of whether a degenerate pivot occurs,

$$
d_{B_{k+1}} = d_{B_k} + (d_s - d_{j_r}) e_r,
$$

so $\rho = d_s - d_{j_r}$. Substituting for $\rho$ and $\sigma_k^T u$ into the result of Theorem 4 yields

$$
\sigma_{k+1}^T = \sigma_k^T + \frac{\bar d_s}{\bar c_s}\left(\pi_{k+1}^T - \pi_k^T\right).
$$

Therefore, despite the existence of the second objective, one never need perform an extra solve during the parametric algorithm. Instead we merely update the vector $\sigma$. The same conclusion applies to any variant of the simplex method that utilizes two constant objective vectors and computes a pair of reduced costs. One could also avoid the extra solve by maintaining a pair of vectors for the reduced costs, but that approach requires extra storage and extra array references, and it is not amenable to partial pricing.
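
The update of Theorem 4 can be checked numerically on a small example. The Python sketch below is an illustration under assumptions: the function name and the data are hypothetical, the matrices are dense, and the second objective is constant (the parametric case), so that $\rho = d_s - d_{j_r}$. It compares the updated multipliers with those obtained from a fresh solve on the new basis.

```python
import numpy as np

def update_sigma(sigma_k, pi_k, pi_k1, rho, A_s, A_jr, cbar_s):
    """Update of the second-objective multipliers without an extra solve."""
    return sigma_k + (rho - sigma_k @ (A_s - A_jr)) / cbar_s * (pi_k1 - pi_k)

# Illustrative data: columns 2 and 3 are slacks.
A = np.array([[1.0, 1.0, 1.0, 0.0], [1.0, 3.0, 0.0, 1.0]])
c = np.array([-1.0, -2.0, 0.0, 0.0])
d = np.array([np.sqrt(2.0), np.sqrt(10.0), 0.0, 0.0])   # constant second objective

basis_k = [0, 3]          # basis during iteration k
s, r = 1, 1               # column 1 enters, replacing the basic variable in row 1
j_r = basis_k[r]
basis_k1 = [0, 1]         # basis during iteration k + 1

B_k, B_k1 = A[:, basis_k], A[:, basis_k1]
pi_k = np.linalg.solve(B_k.T, c[basis_k])
pi_k1 = np.linalg.solve(B_k1.T, c[basis_k1])
sigma_k = np.linalg.solve(B_k.T, d[basis_k])

cbar_s = c[s] - pi_k @ A[:, s]             # reduced cost of the entering column
rho = d[s] - d[j_r]                        # change in component r of d_B

sigma_updated = update_sigma(sigma_k, pi_k, pi_k1, rho, A[:, s], A[:, j_r], cbar_s)
sigma_direct = np.linalg.solve(B_k1.T, d[basis_k1])     # the solve being avoided
```

Both routes give the same multipliers; the update uses only quantities that the simplex iteration has already computed.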


CHAPTER  4:  COMPUTATIONAL  RESULTS 


4.1.  Preliminaries 

The author tested the previously discussed pivot rules on a set of 62 practical problems, 53 of which are publicly available. Problem sizes (excluding slacks) range from small (28 x 32) to large (2263 x 9799). The simplex method solved some of these problems quite efficiently but had great difficulty with others. A heuristic measure of its performance is the ratio of iterations required to the number of rows in the constraints. If this ratio exceeds 5.0, one can consider the problem difficult.

The problems were partitioned into different sets in an attempt to distinguish certain characteristics. The KETRON set, the only proprietary problems tested, consists of nine highly degenerate problems. Degenerate pivots occurred during at least 30 percent of the iterations for each problem when solved by the standard simplex method; sometimes the percentage exceeded 80. The PILOT set contains four linear programs generated by variants of the PILOT model. A large-scale economic model, PILOT uses various units of measurement of the activity levels and input-output items between the many different sectors of the economy. These conversions of units have resulted in notoriously poor scaling of the constraints. Although all four problems arise from the same model, examination of the structure of each problem (see [27]) reveals substantial differences. A collection of 14 staircase problems comprises the STAIRCASE set. The STANFORD set consists of the 12 problems used in Chapter 2 to test the modified feasible direction method. The fifth group, labeled the SHIP set, contains six related problems. Unfortunately, the author has no details on them. The remaining 17 problems form the MISCELLANEOUS set. In general, these problems lack any known categorizable features.

As with the tests of Chapter 2, the author modified MINOS 5.1 in order to implement the desired pivot rules. The primary changes occurred in the pricing routine that determines the incoming column. All other aspects of the simplex method remained unchanged. Once again, one can attribute distinctions in performance to differences in the pivot rules, not inconsistencies in the implementations.

Figure 4-1 lists the test sets and problem sizes. MINOS contains an option that scales a problem before commencing the simplex method. Since scaling may drastically alter performance, both scaled and unscaled problems were tested. Partial pricing was not used.


4.2.  Screening  for  Degenerate  Pivots 

We  begin  with  the  results  for  pivot  rule  (3.1).  Recall  that  this  rule  screens 
columns  in  an  attempt  to  avoid  degenerate  pivots.  We  therefore  label  the  rule  as 
the  “Degeneracy  Screen”  in  the  following  figures.  Figure  4-2  displays  the  results 
for  unsealed  problems.  We  compare  both  iterations  and  CPU  time  with  the  stan¬ 
dard  simplex  method  of  MINOS  5.1.  Note  that  MINOS  5.1  contains  an  anti-cycling 
procedure  designed  to  both  prevent  cycling  and  improve  performance.  Iteration 
counts  reveal  if  the  information  provided  by  the  second  objective  function  is  use¬ 
ful;  the  times  determine  if  that  information  is  worth  the  extra  work.  Also  shown 
is  the  geometric  mean  of  the  resulting  ratios  for  each  test  set;  it  appears  in  the 
boxes  directly  below  each  test  set.  Assuming  all  problems  are  equally  important 
(perhaps  an  unrealistic  assumption  given  the  disparity  in  the  size  and  difficulty  of 
each  problem),  it  measures  relative  performance  of  the  new  pivot  rule  for  each  test 
set.  Values  less  than  1.0  imply  superior  performance  of  the  new  rule. 
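
The summary statistic described above can be sketched as follows. This is a minimal illustration, not the code used to produce the figures; the function names and the iteration counts are hypothetical.

```python
import math


def performance_ratios(new_values, standard_values):
    """Per-problem ratio new/standard; values below 1.0 favor the new rule."""
    return [n / s for n, s in zip(new_values, standard_values)]


def geometric_mean(ratios):
    """Geometric mean of positive ratios, computed through logarithms."""
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be a non-empty list of positive numbers")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical iteration counts for three problems (not taken from the figures).
new_iters = [500, 1200, 90]
std_iters = [1000, 1200, 60]

ratios = performance_ratios(new_iters, std_iters)
summary = geometric_mean(ratios)  # below 1.0 means the new rule wins on this set
```

The geometric mean treats a halving and a doubling as cancelling each other, which is why it suits ratios better than the arithmetic mean.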
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The  Degeneracy  Screen  performed  quite  well  on  the  KETRON  set.  This  is 
not  surprising,  given  that  the  rule  is  explicitly  designed  to  avoid  degenerate  pivots. 
It  substantially  reduced  both  iterations  and  time  on  most  of  these  problems.  It 
performed  exceptionally  well  on  the  larger,  more  difficult  problems.  The  new  pivot 
rule  also  succeeded  on  three  of  the  four  problems  in  the  PILOT  set.  PILOTS,  the 
unsuccessful  problem,  causes  only  3  percent  degenerate  pivots  when  solved  by  the 
standard  simplex  method.  It  is  therefore  not  surprising  that,  with  respect  to  time, 
the  Degeneracy  Screen  failed  to  outperform  the  regular  pivot  rule  on  that  problem. 
The  new  pivot  rule  consistently  reduced  the  iteration  counts  on  the  STAIRCASE 
set  as  well.  However,  on  many  problems,  the  decrease  in  iterations  didn’t  quite 
compensate  for  the  extra  work  per  iteration.  On  the  basis  of  CPU  time  the  two 
pivot  rules  performed  similarly  on  this  set.  Slightly  worse  results  occurred  with 
the  STANFORD  set,  as  only  a  slight  overall  reduction  in  iterations  occurred.  The 
STANFORD  set  includes  ISRAEL,  which  caused  the  most  trouble  for  the  new  rule. 
Test  results  on  the  SHIP  set  were  consistently  unfavorable  with  respect  to  both 
iterations  and  time.  One  should  note,  however,  that,  given  the  problem  sizes,  the 
standard  pivot  rule  performs  extremely  well  on  these  problems,  so  pivot  rule  (3.1) 
also  solves  them  quite  efficiently.  (3.1)  consistently  decreases  the  iteration  counts 
of  the  MISCELLANEOUS  set,  but  it  doesn’t  quite  break  even  in  terms  of  time. 
We  again  encounter  many  problems  where  time  increases  despite  a  decrease  in 
iterations.  Notice  that  it  performs  extremely  well  on  the  problem  FFFFF800. 

Since  the  Degeneracy  Screen  tries  to  avoid  degenerate  pivots,  it  is  interesting 
to examine the level of degeneracy in each test problem. Figure 4-3 contains information
on the frequency of degenerate pivots for each pivot rule on unscaled
problems. The new pivot rule typically reduces the frequency of such blocked pivots,
sometimes quite substantially (see NZFRI, PILOTJA, and FFFFF800). This
suggests that part of its success derives from the ability to avoid unnecessary pivots.
On highly degenerate problems, it seems likely that traversing any path of
vertices  to  an  optimal  solution  will  require  some  degenerate  pivots.  Indeed,  many 
practical  problems  contain  blocks  of  activities  for  which  dropping  a  key  activity  of 
a  block  implies  dropping  all  other  activities  in  the  block.  Nonetheless,  Figure  4-3 
suggests that many such pivots performed by the simplex method are unnecessary.
Notice  also  that  the  pivot  rule  can  still  work  well  even  when  it  doesn’t  reduce  the 
percentage  of  blocked  pivots;  TUFF,  CYCLE  and  WOODW  provide  examples  of 
this  behavior. 
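
To make the notion of a blocked pivot concrete, the following sketch shows a textbook minimum-ratio test and the computation of the fraction of pivots with zero step length. It assumes the standard form with nonnegative variables rather than the bounded variable format of MINOS, and all names and data are hypothetical.

```python
def ratio_test(x_basic, column, tol=1e-9):
    """Minimum-ratio test for an entering column.

    x_basic: current values of the basic variables (all nonnegative).
    column:  the entering column expressed in terms of the current basis.
    Returns (step, row), or (None, None) if no basic variable blocks.
    """
    best_step, best_row = None, None
    for i, (xi, ai) in enumerate(zip(x_basic, column)):
        if ai > tol:
            step = xi / ai
            if best_step is None or step < best_step:
                best_step, best_row = step, i
    return best_step, best_row


def degenerate_fraction(steps, tol=1e-9):
    """Fraction of pivots whose step length is numerically zero."""
    if not steps:
        return 0.0
    return sum(1 for s in steps if s <= tol) / len(steps)


# A basic variable at zero with a positive column entry blocks the pivot.
blocked_step, blocked_row = ratio_test([0.0, 2.0, 3.0], [1.0, 1.0, -1.0])
# The same vertex admits a positive step for a different entering column.
free_step, free_row = ratio_test([0.0, 2.0, 3.0], [-1.0, 1.0, 1.0])
```

A statistic of the kind reported in Figure 4-3 corresponds to `degenerate_fraction` applied to the step lengths recorded over a whole run.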

Figure  4-4  contains  comparisons  of  the  same  two  pivot  rules  for  scaled  prob¬ 
lems.  Relative  performance  remains  virtually  unchanged  for  each  test  set.  Very  few 
individual  problems  show  significant  differences;  SCSD8  and  FFFFF800  provide  two 
exceptions.  The  PILOT  set  is  particularly  noteworthy,  since  scaling  dramatically 
improves  the  performance  of  the  standard  pivot  rule.  Nonetheless,  the  Degeneracy 
Screen  still  results  in  significant  improvement  when  applied  to  the  scaled  problems. 
As  before,  it  typically  reduces  the  frequency  of  blocked  pivots;  refer  to  Figure  4-5 
for  details. 


4.3.  Screening  for  Small  Step  Lengths 


Figure 4-6 contains results for pivot rule (3.2) applied to unscaled problems.
Recall  that  this  rule  attempts  to  exclude  potential  incoming  variables  that  would 
result  in  small  step  lengths  (hence  the  label  “Small  Screen”  in  Figures  4-6  and  4-7). 
It  performs  fairly  well  on  the  KETRON  set,  yielding  a  moderate  overall  improve¬ 
ment  in  times.  It  does  not  perform  as  well  on  these  highly  degenerate  problems  as 
the  Degeneracy  Screen.  This  is  to  be  expected  since  this  rule  sacrifices  the  ability  to 
avoid zero step lengths in order to gain additional information about positive ones.
However, this rule works extremely well on the PILOT set, achieving tremendous
reductions of both iterations and time on three of the four problems. On PILOTS,
the fourth problem, it reduces iterations but marginally increases time. This is more
than offset by its performance on the other three problems, particularly PILOTJA.
The  approach  does  not  work  as  well  on  the  STAIRCASE  and  STANFORD  sets, 
as  it  marginally  reduces  iterations  but  typically  increases  time.  The  SHIP  set  once 
again  proves  difficult,  although  the  new  pivot  rule  still  performs  well  given  the  prob¬ 
lem  sizes.  Iterations  for  the  MISCELLANEOUS  set  typically  decrease,  but  these 
reductions  frequently  fail  to  compensate  for  the  extra  computation.  Recall  that 
pivot  rules  (3.2)  and  (3.3)  involve  computation  of  the  mean  of  the  basic  variables. 
This  comprises  a  significant  portion  of  the  extra  work,  especially  with  respect  to  the 
bounded  variable  format  of  MINOS.  Given  the  experimental  nature  of  these  tests, 
the  author  computed  the  exact  value  of  the  mean  during  each  iteration.  In  prac¬ 
tice,  one  could  approximate  the  mean  using  a  variety  of  computationally  cheaper 
approaches,  thus  reducing  the  times  significantly. 
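
One conceivable cheaper approach is to reuse a stale value of the mean and refresh it only periodically. The sketch below illustrates that idea only; it is an assumption of this illustration rather than a technique tested in this thesis, the class name and the refresh period are hypothetical, and the list of basic values is assumed to be non-empty.

```python
class PeriodicMean:
    """Caches the mean of the basic variables, refreshing every `period` calls."""

    def __init__(self, period=10):
        if period < 1:
            raise ValueError("period must be at least 1")
        self.period = period
        self._age = period  # forces an exact computation on first use
        self._value = 0.0
        self.exact_evaluations = 0

    def get(self, x_basic):
        """Return the cached mean, recomputing it exactly when it is too old."""
        if self._age >= self.period:
            self._value = sum(x_basic) / len(x_basic)
            self._age = 0
            self.exact_evaluations += 1
        self._age += 1
        return self._value


# Seven simplex iterations with hypothetical basic values; only three exact means.
tracker = PeriodicMean(period=3)
estimates = [tracker.get([float(k), 2.0]) for k in range(7)]
```

The trade-off is between the work saved and the quality of the step-length estimates that depend on the mean; whether a stale mean degrades the pivot rules noticeably would have to be determined experimentally.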

Figure  4-7  contains  results  for  pivot  rule  (3.2)  on  scaled  problems.  On  most 
sets  scaling  marginally  improves  its  performance  relative  to  the  usual  pivot  rule. 
On  the  PILOT  set,  relative  performance  declines  substantially,  primarily  because 
scaling  improves  the  standard  rule  so  dramatically.  On  average  the  new  pivot  rule 
still  reduces  iterations,  but  it  now  causes  a  significant  increase  in  time  for  two  of 
the  four  problems. 


4.4.  Piecewise  Linear  Estimation  of  the  Step  Length 

Figure 4-8 displays results for pivot rule (3.3) on unscaled problems. Recall
that  (3.3)  is  a  three-priority  pivot  rule  that  uses  a  piecewise  linear  function  to  es¬ 
timate  the  step  length  associated  with  a  potential  incoming  variable.  We  therefore 
use  the  abbreviation  “PLSE”  in  Figures  4-8  and  4-9.  The  approach  generally  suc¬ 
ceeds  on  the  KETRON  set,  except  for  the  problem  CYCLE.  Great  improvement  in 
NZFRI  and  DEGEN3  outweighs  this  bad  problem.  The  rule  executes  well  on  the 
PILOT  set,  dramatically  reducing  iterations  and  time  on  three  of  these  very  difficult 
problems.  Frequent  reductions  in  iterations  only  occasionally  reduce  CPU  time  for 
the  STAIRCASE  and  STANFORD  sets.  The  SHIP  set  continues  to  stymie  all  of  the 
new  pivot  rules,  as,  once  again,  both  iteration  counts  and  times  exceed  those  of  the 
usual  rule.  Results  are  mixed  on  the  MISCELLANEOUS  set.  GROW22  emerges 
as  the  worst  problem  encountered  for  this  rule.  On  the  positive  side,  it  is  encour¬ 
aging  to  note  improvement  on  CZPROB.  The  ratio  of  iterations  to  rows  for  the 
normal  rule  on  CZPROB  is  about  2.0,  and  very  few  degenerate  pivots  occur.  Thus, 
pivot  rule  (3.3)  enhances  performance  even  though  the  standard  method  handled 
the  problem  quite  effectively. 

Figure  4-9  contains  the  results  of  pivot  rule  (3.3)  for  scaled  problems.  Except 
for  the  PILOT  set,  scaling  has  very  little  effect  on  relative  performance.  With 
respect  to  the  PILOT  set,  the  new  pivot  rule  still  outperforms  the  standard  rule 
on average. The improvement is less dramatic than on unscaled problems, but it is
nonetheless  noteworthy  given  the  benefits  of  scaling  for  the  usual  rule. 


4.5.  Nonlinear  Estimation  of  the  Step  Length 

Figure 4-10 displays results for pivot rule (3.6) on unscaled problems. (3.6) utilizes
a nonlinear function to estimate step lengths. Positive results for the KETRON
and PILOT sets resemble those for pivot rule (3.3). In general, (3.6) does not perform
well on the STAIRCASE set; SCSD8 is particularly discouraging. It yields a
slight overall reduction in iterations for the STANFORD set, but times typically increase.
Like the other pivot rules, it performs unfavorably on the SHIP set. Results
vary  drastically  on  the  MISCELLANEOUS  set.  It  performs  quite  well  on  CZPROB 
and  NESM.  The  improvement  for  NESM  is  encouraging  since  the  other  new  rules 
failed  to  reduce  CPU  time.  However,  it  performs  quite  poorly  on  80BAU3B,  as  it 
more  than  doubles  the  iterations  and  triples  the  time.  This  is  the  only  example 
where  one  of  the  new  rules  increased  the  ratio  of  iterations  to  rows  to  above  5.0. 

Figure  4-11  summarizes  the  performance  of  (3.6)  on  scaled  problems.  Only 
minor differences from the unscaled results occur. Even the PILOT set reveals only
a  moderate  decline  in  relative  efficiency. 

4.6.  Parametric  Method 

Figure  4-12  shows  results  for  the  standard  parametric  algorithm  outlined  in 
Figure  3-1.  The  algorithm  performs  quite  well  on  the  KETRON  set.  This  confirms 
our  intuition  since  the  algorithm’s  purpose  is  to  make  progress  in  the  dual  even 
when  stalled  in  the  primal.  The  results  for  NZFRI  are  particularly  encouraging. 
Notice  also  that  the  parametric  algorithm  requires  less  additional  work  per  iteration 
than  the  previously  tested  rules.  The  algorithm  exhibits  tremendous  success  on 
the  PILOT  set,  as  it  solves  each  problem  at  least  twice  as  quickly.  Notice  that  it 
solved PILOTJA more than eight times faster. This performance is not particularly
surprising  since  the  parametric  algorithm’s  choice  of  incoming  variable  remains 
invariant  under  column  scaling.  Given  the  poor  scaling  present  in  these  problems, 
one  might  anticipate  the  benefit  of  a  unit-free  pivot  rule.  Nonetheless,  we  shall 
see  that  the  rule  still  performs  well  when  MINOS  scales  these  problems.  Mixed 
results  characterize  the  STANFORD  set,  as  we  see  good  performances  on  CAPRI 
and  E226  accompanied  by  disappointing  ones  for  BRANDY  and  ETAMACRO. 
Results  for  the  SHIP  set  strongly  resemble  those  for  the  other  pivot  rules.  None  of 
the  two-objective  strategies  seems  to  work  well.  Performance  varies  drastically  on 
the  MISCELLANEOUS  set.  The  rule  does  quite  well  on  STANDATA,  VTPBASE 
and  FFFFF800.  However,  an  alarming  trend  emerges  for  the  problems  GROW7, 
GROW15,  and  GROW22.  The  same  model  generates  each  of  these  problems;  only 
the  number  of  time  periods  changes.  Performance  worsens  as  size  increases.  The 
parametric  method  requires  over  six  times  more  CPU  time  to  solve  GROW22,  a 
figure  far  beyond  the  worst  problems  for  any  of  the  other  pivot  rules.  Nonetheless, 
except  for  these  three  problems,  the  algorithm  doesn’t  increase  CPU  time  by  more 
than  fifty  percent  and  works  quite  well  on  most  of  the  larger,  more  difficult  problems. 

Figure  4-13  contains  results  for  the  standard  parametric  algorithm  on  scaled 
problems.  Although  overall  relative  performance  declines  compared  to  the  results 
without  scaling,  the  approach  still  does  well  on  most  of  the  larger  problems.  With 
respect  to  the  KETRON  set,  scaling  had  little  influence  on  eight  of  the  nine  prob¬ 
lems,  and  relative  performance  remained  quite  favorable.  However,  on  WOODW 
the  parametric  algorithm  required  much  more  time  than  the  simplex  method,  as  the 
ratio of CPU times exceeded five. Contrast this with the unscaled results, where
the  parametric  algorithm  solved  this  problem  substantially  faster  than  the  simplex 
method.  Nonetheless,  considering  all  problems  equally,  the  parametric  algorithm 
performs  favorably  on  this  set.  As  for  the  PILOT  set,  relative  performance  declines 
substantially compared to unscaled results, but the algorithm still outperforms the
simplex  method  on  all  four  problems.  Scaling  results  in  only  minor  differences  on 
the  remaining  test  sets.  Note  that  the  parametric  algorithm  processes  GROW15 
and  GROW22  much  more  effectively  when  scaled,  although  it  still  requires  more 
time  than  the  simplex  method. 

Since the parametric algorithm is designed particularly for degenerate problems,
we again examine the frequency of degenerate pivots. Figures 4-14 and 4-15
show the results for unscaled and scaled problems. Unlike the Degeneracy Screen,
the parametric algorithm does not usually reduce the frequency of blocked pivots.
Consider the problem DEGEN3. The algorithm dramatically outdoes the simplex method, yet the
percentage  of  blocked  pivots  increases  slightly.  On  DEGEN2  it  outperforms  the 
simplex  method  despite  a  significant  increase  in  the  frequency;  one  encounters  sim¬ 
ilar  results  for  SEBA  and  80BAU3B.  This  characteristic  motivated  the  modified 
parametric  algorithm  of  Figure  3-2. 

4.7.  Summary 

Summarizing  the  results,  the  Degeneracy  Screen  appears  to  be  the  best  of  the 
pivot  rules  that  utilize  a  dynamic  second  objective  vector.  It  decreases  iteration 
counts  on  the  vast  majority  of  problems  tested.  Even  when  iterations  increase, 
times  almost  never  exceed  those  of  the  standard  simplex  method  by  a  factor  greater 
than  1.3.  Also,  the  instances  where  relative  performance  was  poorest  consisted 
of  small  to  moderately  sized  problems  that  the  standard  method  solved  quite  effi¬ 
ciently.  This  contrasts  with  the  new  rule’s  ability  to  substantially  reduce  iterations 
and  time  on  highly  degenerate  problems.  Many  of  these  are  the  large,  difficult  prob¬ 
lems  that  require  large  amounts  of  CPU  time  when  solved  by  the  regular  method. 
The  Degeneracy  Screen  decreases  the  susceptibility  of  the  simplex  method  to  such 
disasters,  and  it  does  so  at  minimal  risk. 

The  main  drawback  of  this  rule  is  that  it  is  unlikely  to  do  well  on  problems  with 
few  blocked  pivots  like  CZPROB  and  PILOTS.  Pivot  rules  (3.2),  (3.3),  and  (3.6) 
attempt  to  alleviate  this  problem.  They  can  succeed  in  solving  fairly  nondegenerate 
linear  programs  quickly,  and  they  perform  extremely  well  on  the  PILOT  set,  another 
group  of  very  difficult,  time  consuming  problems.  Unfortunately,  as  the  potential  for 
savings  improves,  so  does  the  possibility  of  substantial  increases  in  time.  The  worst 
relative  performances  may  occur  on  larger  problems,  and  the  CPU  time  may  exceed 
1.5 times that of the standard rule. In rare cases the rules require twice as much time.
Despite  the  risks,  these  rules  still  make  the  simplex  method  less  prone  to  disaster 
since  they  work  well  on  most  of  the  large,  difficult  problems.  Similar  conclusions 
arise  for  the  parametric  algorithm,  but  the  variation  in  performance  is  much  greater. 
The  algorithm  exhibits  the  ability  to  improve  or  worsen  times  by  factors  greater 
than six. The risk increases, but so does the payoff.

Figure 4-1. Test Problem Sizes
Figure 4-2. Results for (3.1) on Unscaled Problems
Figure 4-2 (ctd). Results for (3.1) on Unscaled Problems
Figure 4-4 (ctd). Results for (3.1) on Scaled Problems
Figure 4-5 (ctd). Blocked Pivots for (3.1) on Scaled Problems
Figure 4-6. Results for (3.2) on Unscaled Problems
Figure 4-8. Results for (3.3) on Unscaled Problems
Figure 4-8 (ctd). Results for (3.3) on Unscaled Problems
Figure 4-10. Results for (3.6) on Unscaled Problems
Figure 4-10 (ctd). Results for (3.6) on Unscaled Problems
Figure 4-11. Results for (3.6) on Scaled Problems
Figure 4-11 (ctd). Results for (3.6) on Scaled Problems
Figure 4-12. Results for Parametric Algorithm on Unscaled Problems
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CHAPTER  5:  EXTENSIONS  AND  FUTURE  RESEARCH 


This  chapter  briefly  considers  some  untested  ideas  motivated  by  Chapters  2 
and  3.  Potential  for  improvement  in  feasible  direction  methods,  dynamic  pricing, 
and  parametric  variants  of  the  simplex  method  still  remains.  None  of  the  ideas  is 
developed  in  detail,  but  the  mathematics  of  the  previous  chapters  should  provide  a 
foundation  for  future  work. 

5.1.  Feasible  Direction  Methods 

Much  flexibility  remains  for  the  algorithms  of  Chapter  2.  In  particular,  there 
exist  criteria  for  selecting  promising  variables  other  than  (2.13)  or  (2.30).  However, 
the  results  of  Chapter  2  suggest  that  any  good  criterion  must  account  for  the  sparsity 
and  degeneracy  present  in  practical  problems.  One  need  not  use  two  objectives  to 
handle  degeneracy  explicitly.  For  example,  one  could  apply  the  approach  of  Chang 
and  Murty’s  Gravitational  Method  (see  [4]).  Consider  the  linear  program 

$$
\begin{aligned}
\text{minimize}\quad & c^{T}x \\
\text{subject to}\quad & Ax \ge b.
\end{aligned}
$$

Assume  a  feasible  solution  x,  and  define 

$$J(x) = \{\, i : A_{i\cdot}\,x = b_i \,\}.$$

$J(x)$ indexes the tight constraints corresponding to $x$. If $J(x) = \emptyset$, $x$ lies in the
strict interior of the feasible region, so $-c$ provides a descent direction that permits a
positive  step  length.  Otherwise,  use  the  tight  constraints  to  formulate  the  following 
direction-finding  problem: 


$$
\begin{aligned}
\text{minimize}\quad & c^{T}y \\
\text{subject to}\quad & A_{J(x)\cdot}\,y \ge 0, \\
& 1 - y^{T}y \ge 0.
\end{aligned}
$$


The  nonlinear  constraint  ensures  boundedness  of  the  problem’s  feasible  region.  Its 
optimal  solution  provides  a  descent  direction  with  a  positive  step  length.  The 
Gravitational  Method  does  not  maintain  basic  and  nonbasic  variables,  but  one  could 
modify  the  direction-finding  problem  and  apply  it  to  the  algorithms  of  Chapter  2. 
Specifically,  consider  the  canonical  form  (1.2)  and  a  (not  necessarily  basic)  feasible 
solution  x.  Let 


$$J(x_B) = \{\, i : x_{j_i} = 0 \,\}.$$

The  direction-finding  problem  becomes 

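$$
\begin{aligned}
\text{minimize}\quad & \bar{c}_N^{T}y \\
\text{subject to}\quad & \bar{A}_{J(x_B)N}\,y \le 0, \\
& 1 - y^{T}y \ge 0.
\end{aligned}
$$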
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The  optimal  solution  of  this  problem  yields  a  descent  direction  with  positive  step 
length.  This  is  extremely  important  given  the  susceptibility  of  feasible  direction 
methods  to  degeneracy.  One  also  sees  that  the  approach  generalizes  to  positive 
basic variables with a suitable redefinition of $J(x_B)$. The additional work required
to  solve  the  direction-finding  problem  may  inhibit  this  approach.  Nonetheless,  it 
illustrates  another  way  to  alleviate  problems  due  to  degeneracy. 
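
The ingredients of the direction-finding problem for the inequality form (minimize $c^{T}x$ subject to $Ax \ge b$) can be sketched in a few lines. The code below identifies the tight set, checks whether a candidate direction satisfies the constraints of the direction-finding problem, and computes the resulting step length. It does not solve the direction-finding problem itself, and the function names and the small example are hypothetical.

```python
import math


def tight_set(A, b, x, tol=1e-9):
    """Indices i with A[i] . x = b[i] within tol, i.e. the set J(x)."""
    return [i for i, (row, bi) in enumerate(zip(A, b))
            if abs(sum(a * xj for a, xj in zip(row, x)) - bi) <= tol]


def interior_direction(c):
    """Unit-length direction -c/||c||, usable when J(x) is empty."""
    norm = math.sqrt(sum(cj * cj for cj in c))
    if norm == 0.0:
        raise ValueError("c = 0: every feasible point is optimal")
    return [-cj / norm for cj in c]


def is_feasible_direction(A, J, y, tol=1e-9):
    """Check A_J y >= 0 and 1 - y.y >= 0 for a candidate direction y."""
    rows_ok = all(sum(a * yj for a, yj in zip(A[i], y)) >= -tol for i in J)
    ball_ok = 1.0 - sum(yj * yj for yj in y) >= -tol
    return rows_ok and ball_ok


def max_step(A, b, x, y, tol=1e-9):
    """Largest t >= 0 with A(x + t y) >= b, or None if the step is unbounded."""
    best = None
    for row, bi in zip(A, b):
        slack = sum(a * xj for a, xj in zip(row, x)) - bi
        rate = sum(a * yj for a, yj in zip(row, y))
        if rate < -tol:  # this constraint's slack shrinks along y
            t = slack / -rate
            if best is None or t < best:
                best = t
    return best


# Hypothetical example: minimize x1 + x2 subject to x1 >= 0, x2 >= 0.
A = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
c = [1.0, 1.0]
```

At the interior point $(1, 2)$ the tight set is empty and $-c/\|c\|$ permits a positive step. At the boundary point $(0, 2)$ the same direction violates the tight constraint, whereas $y = (0, -1)$ satisfies the direction-finding constraints and still allows a step of length 2.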

All  of  the  computational  tests  involving  feasible  direction  methods  involve  only 
a  single  column  exchange  in  the  basis  during  each  iteration.  Although  this  tac¬ 
tic  guarantees  a  descent  direction,  it  may  result  in  some  basic  variables  having 
substantially  smaller  values  than  nonbasic  ones.  Remember  that  the  one-to-one 
correspondence  between  vertices  and  bases  has  vanished.  One  can  associate  any 
basis  with  any  feasible  solution,  but  the  benefit  of  the  descent  directions  generated 
by  each  basis  may  vary  dramatically.  Therefore,  one  wishes  to  associate  the  feasible 
iterate  with  a  “good”  choice  of  basis.  Single  column  exchanges  limit  the  choice  of 
basis, so the development of computationally efficient techniques to exchange multiple
columns during a single iteration emerges as an important topic. The Box
Method  of  Cottle  and  Zikan  (see  [54])  provides  insight  into  this  problem. 

5.2.  Dynamic  Pricing 


The  dynamic  pricing  criteria  of  Chapter  3  all  display  the  ability  to  reduce 
the  iterations  required  by  the  simplex  method  to  solve  practical  linear  programs. 
However,  we  encounter  many  cases  where  the  reduction  in  iterations  fails  to  reduce 
computation  time.  The  need  to  develop  additional  techniques  to  reduce  the  extra 
work  becomes  apparent.  Theorem  4  actually  takes  advantage  of  a  degenerate  pivot 
to  reduce  the  computational  effort  in  a  multiple-priority  pivot  rule.  Refer  to  [38] 
and  [39]  for  other  ways  to  exploit  degeneracy  in  the  simplex  method.  Mathematical 
results  such  as  Theorem  4  should  help,  but  one  must  also  realize  that  MINOS  was 
developed  for  the  single-objective  format  of  the  standard  simplex  method.  The 
implementations  tested  here  conformed  to  that  design.  Additional  modifications  in 
the  structure  of  MINOS  may  improve  the  performance  of  multiple-objective  pivot 
rules. 

Another approach to reducing computational time involves techniques to approximate
the second objective reduced costs $\bar{d}_j$. All of the implementations compute
the quantities $\sigma$ and $\bar{d}_j$ explicitly during each iteration. Exploration of cheaper
techniques to approximate these quantities may lighten the computational load.

The encouraging performance of these new rules on difficult practical problems
does not necessarily suggest good worst-case behavior. None of the pathological
problems constructed (see [3], [17], [19], [24], [25] and [53]) to demonstrate
worst-case  behavior  of  established  pivot  rules  for  the  simplex  method  relies  on  de¬ 
generacy.  Furthermore,  procedures  such  as  (3.2),  (3.3)  and  (3.6)  attempt  to  emulate 
inexpensively  the  maximum-improvement  pivot  rule  (3.7).  Jeroslow  [19]  established 
exponential  worst-case  behavior  of  that  rule.  It  therefore  seems  unlikely  that  re¬ 
search  into  this  area  for  the  pivot  rules  of  Chapter  3  will  provide  any  significant 
results. 

Finally,  application  of  a  second  objective  to  certain  nonlinear  programming 
algorithms  also  merits  investigation.  Dynamic  pricing  criteria  significantly  improved 
the  performance  of  the  feasible  direction  methods  of  Chapter  2.  This  suggests  the 
promise  of  using  dynamic  pricing  when  solving  nonlinear  programs  by  the  reduced- 
gradient  method.  Lemke’s  algorithm  may  also  benefit  from  this  technique.  Lemke’s 
algorithm  is  a  pivotal  algorithm  that  can  optimize  certain  quadratic  programs  in 
addition to linear programs (see [41]). It solves the following linear complementarity
problem:

$$
\begin{aligned}
Ix - My &= q, \\
x_i y_i &= 0, \quad i = 1, \ldots, n, \\
x, y &\ge 0.
\end{aligned}
\tag{5.1}
$$

Assuming  nondegeneracy,  the  determination  of  a  covering  vector  to  initiate  the 
algorithm  uniquely  determines  the  sequence  of  pivot  steps.  However,  in  practice 
some  flexibility  in  the  pivoting  procedure  may  arise  in  the  presence  of  degeneracy. 
Theoretically  one  must  appeal  to  the  perturbation  techniques  used  in  the  simplex 
method  to  resolve  such  ambiguities.  In  practice  it  may  be  possible  to  break  ties 
arbitrarily  without  encountering  a  cycle;  this  is  almost  always  true  with  respect 
to  the  simplex  method.  If  it  also  holds  for  Lemke’s  algorithm,  perhaps  one  can 
exploit  the  flexibility  arising  from  degeneracy.  In  certain  cases,  Lemke’s  algorithm 
may  not  find  a  solution  to  the  linear  complementarity  problem  even  though  one 
exists;  it  can  terminate  on  a  ray.  Freedom  in  selection  of  the  entering  variable 
permits  development  of  pricing  criteria  designed  to  avoid  such  an  occurrence.  In 
particular, let $\bar{M}_{\cdot j}$ represent a potential incoming column relative to the current
basis. Termination on a ray occurs when $\bar{M}_{\cdot j} \le 0$. Therefore, setting $d_B = 1$,
$d_N = 0$ and computing $\bar{d}_j = d_B^{T}\bar{M}_{\cdot j} = \sigma^{T}M_{\cdot j}$ provides a measure of the likelihood
of termination on a ray if a particular variable enters the basis. When freedom to
select the entering variable arises, one could utilize $\bar{d}_j$ to choose the variable least
likely  to  result  in  termination  on  a  ray. 
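
A small sketch may clarify the quantities involved. The first function checks whether a pair of vectors solves the linear complementarity problem (5.1); the other two compute the proposed measure as the sum of the entries of an updated column and use it to choose among tied candidates. This is an untested illustration of the idea, consistent with the status of this chapter; the function names, candidate labels, and data are hypothetical.

```python
def is_lcp_solution(M, q, x, y, tol=1e-9):
    """Check x - M y = q, x_i * y_i = 0 for every i, and x, y >= 0."""
    n = len(q)
    for i in range(n):
        residual = x[i] - sum(M[i][j] * y[j] for j in range(n)) - q[i]
        if abs(residual) > tol:
            return False
        if abs(x[i] * y[i]) > tol or x[i] < -tol or y[i] < -tol:
            return False
    return True


def ray_score(updated_column):
    """Sum of the entries of an updated column (second objective of all ones).

    A column with no positive entry leads to termination on a ray, and its
    score is then nonpositive; a larger score is read here as lower risk.
    """
    return sum(updated_column)


def pick_entering(candidates):
    """Among tied candidates {label: updated column}, take the largest score."""
    return max(candidates, key=lambda label: ray_score(candidates[label]))


# Hypothetical 2 x 2 instance: with x = 0, M y = -q gives y = (1, 1).
M = [[2.0, 1.0], [1.0, 2.0]]
q = [-3.0, -3.0]

# Hypothetical tie between two entering candidates.
choice = pick_entering({"a": [-1.0, -2.0], "b": [0.5, -0.25]})
```

The score is only a heuristic: a positive sum guarantees that at least one entry is positive, but a nonpositive sum does not imply ray termination.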

5.3.  Parametric  Algorithms 

The  parametric  algorithm  of  Figure  3-1  specifies  an  initial  parametric  objective 
vector.  This  is  merely  one  of  many  possible  legitimate  initial  vectors.  Research 
into  this  area  may  reveal  new  initialization  procedures  that  further  enhance  the 
performance  of  the  parametric  algorithm. 

Parametric variants of the simplex method strongly resemble Lemke’s algorithm
applied to linear programs. In particular (see [28]), solving a linear program
by Lemke’s algorithm is equivalent to solving it by Dantzig’s self-dual parametric
algorithm. The selection of the parametric objective corresponds to the initialization
of  the  covering  vector.  Given  this  relation,  research  into  the  proper  selection  of  the 
parametric  objective  vector  should  provide  insight  about  good  initial  covering  vec¬ 
tors  for  Lemke’s  algorithm.  Furthermore,  the  potential  for  the  modified  parametric 
algorithm  of  Figure  3-2  to  avoid  degenerate  pivots  suggests  an  analogous  variant  of 
Lemke’s  algorithm  in  which  the  covering  vector  changes  during  the  course  of  the 
algorithm  in  a  way  designed  to  decrease  the  number  of  iterations. 


CHAPTER  6:  SUMMARY  AND  CONCLUSIONS 


What  conclusions  emerge  from  this  work?  We  have  seen  that  sparsity  and 
degeneracy  can  inhibit  the  progress  of  the  reduced-gradient  variants  described  in 
Chapter  2.  Utilization  of  a  dynamic  second  objective  function  helps  avoid  these 
obstacles  and  significantly  improves  the  performance  of  such  algorithms.  Although 
the  modified  algorithm  tested  here  slightly  outperformed  the  simplex  method  with 
respect  to  iterations,  it  failed  to  compete  in  terms  of  computation  time.  The  notion 
of  using  a  reduced-gradient  approach  to  solve  linear  programs  has  existed  for  many 
years,  but  very  few  computational  tests  have  appeared  in  the  literature.  Chapter 
2  provides  new  insight  into  this  topic  by  identifying  essential  problems  with  the 
method,  describing  techniques  to  elude  these  difficulties,  and  establishing  some 
computational  results.  Although  still  not  competitive  with  the  simplex  method, 
the  results  achieved  here  substantially  improve  upon  the  status  quo.  Additional 
improvements  in  this  type  of  algorithm  may  remain,  but  they  must  account  for  the 
sparsity  and  degeneracy  present  in  practical  problems. 

The  most  important  aspect  of  Chapter  2  is  the  motivation  of  the  use  of  two 
objectives  to  gain  useful  information  about  nonbasic  variables.  Pricing  out  to  avoid 
degenerate  pivots  has  become  a  computationally  feasible  procedure.  The  idea  gen¬ 
eralizes  to  allow  pricing  out  to  avoid  small  pivot  steps.  Chapter  2  shows  that  these 
concepts enhance the performance of variants of the reduced-gradient method.

Chapter  3  addresses  applications  to  the  simplex  method.  Additional  extensions 
of  the  two-objective  approach  focus  on  estimating  the  step  length  associated  with  a 
nonbasic  variable.  Explicit  computation  of  the  exact  step  length  for  each  potential 
incoming  nonbasic  variable  is  prohibitively  expensive.  Dynamic  pricing  provides 
one  inexpensive  way  to  estimate  such  step  lengths. 

The computational tests in Chapter 4 show that each of the pivot rules using
dynamic pricing outperforms the standard simplex method with regard to iterations.
Given  the  increasing  emphasis  on  parallel  computing,  this  result  is  useful  by  itself. 
The  results  are  less  clear  with  respect  to  computation  time,  as  reductions  in  itera¬ 
tions  frequently  do  not  outweigh  the  increased  work  per  iteration.  The  Degeneracy 
Screen  (pivot  rule  (3.1))  remains  a  safe  rule,  as  CPU  times  stay  close  to  those  of  the 
standard  simplex  method  for  even  the  worst  problems.  Furthermore,  it  consistently 
performs  well  on  highly  degenerate  problems.  Certain  applications  tend  to  generate 
highly  degenerate  problems  (for  example,  see  [11]).  Hence,  one  can  probably  an¬ 
ticipate  cases  where  (3.1)  will  always  work  well.  The  other  pivot  rules  display  the 
capacity  to  perform  well  on  a  wider  variety  of  problems,  but  this  greater  potential 
is  accompanied  by  increased  risk  of  poor  performance.  Nonetheless,  all  of  the  dy¬ 
namic  pivot  rules  perform  well  on  the  majority  of  the  large,  difficult  problems  that 
consume  the  most  computer  time. 

The  final  conclusion  from  this  thesis  is  that  despite  40  years  of  intensive  re¬ 
search,  potential  for  improvement  remains  in  the  simplex  method  as  well  as  other 
linear  programming  algorithms.  Many  researchers  tend  to  assume  that  basic  re¬ 
search  in  linear  programming  has  been  exhausted.  The  results  here  suggest  that  we 
do  not  yet  fully  understand  the  simplex  method,  let  alone  other  linear  programming 
algorithms.  Uncharted  territory  remains  to  be  explored. 
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CHAPTER  7:  APPENDIX 


This  chapter  elaborates  on  the  specifics  of  the  computational  tests  of  Chapters  2 
and  4.  Certain  aspects  of  the  tests  involve  generalizations  of  the  main  results  of  this 
thesis.  Inclusion  of  these  details  earlier  would  only  have  obscured  the  important 
ideas.  One  need  not  read  this  chapter  if  only  interested  in  the  major  ideas;  no 
new  mathematics  will  appear  here.  However,  any  reader  interested  in  bridging  the 
gap  between  theory  and  practice  present  in  many  implementations  may  find  this 
appendix  useful. 

Section  1  develops  a  procedure  to  determine  an  optimal  vertex  from  an  op¬ 
timal  solution.  There  exist  several  different  ways  to  find  such  a  vertex,  including 
a  procedure  in  MINOS  based  on  the  simplex  method.  The  method  defined  below 
strongly  resembles  the  feasible  direction  methods  of  Chapter  2.  It  enables  those 
algorithms  to  produce  an  optimal  vertex  instead  of  just  an  optimal  solution.  Sec¬ 
tion  2  then  extends  the  theory  of  Chapters  2  and  3  to  handle  the  bounded  variable 
logic  of  MINOS.  None  of  the  changes  involves  any  significant  additional  mathemat¬ 
ics.  Nonetheless,  they  provide  useful  insight  into  important  aspects  of  a  large-scale 
linear  programming  code. 

7.1.  A  Procedure  to  Find  an  Optimal  Vertex  From  an  Optimal  Solution 

The  algorithms  of  Chapter  2  find  an  optimal  solution  to  the  linear  program 
(1.1).  However,  given  the  interest  in  sensitivity  analysis,  an  optimal  vertex  becomes 
desirable.  The  following  procedure  determines  such  a  vertex. 

Irrespective of the particular choice of promising variables, the feasible direction
method terminates with an optimal solution $x^*$ when the set $P$, defined in (2.13), is
empty. Let $B$ and $N$ index the basic and nonbasic variables associated with $x^*$ at
termination. Let

$$I = \{\, j \in N : \bar{c}_j = 0,\ x^*_j > 0 \,\}.$$

Since $P = \emptyset$, $I$ indexes the positive nonbasic variables of $x^*$. If $I = \emptyset$, $x^*$ is the
optimal vertex corresponding to $B$. If $I \neq \emptyset$, $x^*$ is not a vertex. In order to identify
an optimal basis, decrease the variables in $I$ simultaneously down to zero. Let $e$
represent a column vector of ones. In the framework of Chapter 2, set $q_P = q_I = -e$,
and perform an iteration of the reduced-gradient method. The resulting solution
remains optimal since $\bar{c}_j = 0$ for $j \in I$. During this process individual variables
in $I$ either attain zero or drive a basic variable to zero. In the latter case a pivot
occurs. Each iteration decreases $|I|$ by at least 1, so the procedure terminates with
an optimal vertex in at most $|I|$ iterations.

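We now consider an iteration in detail. We wish to decrease each variable in $I$
without violating the equality constraints $Ax = b$. In order to do so, define $q \in \mathbb{R}^n$
by

$$q_I = -e, \qquad q_{N \setminus I} = 0, \qquad B q_B = -A_I q_I.$$

Update $x^*$ by the relation

$$x^* \leftarrow x^* + \theta q. \tag{7.1}$$

How large can $\theta$ be? We wish to make $\theta$ as large as possible while maintaining
nonnegativity of $x^*_B$ and $x^*_I$. In other words,

$$x^* + \theta q \ge 0
\;\Longrightarrow\; \theta q_j \ge -x^*_j, \quad j \in B \cup I
\;\Longrightarrow\; \theta = \min_{j \in B \cup I :\, q_j < 0} \frac{-x^*_j}{q_j}. \tag{7.2}$$

Let

$$k = \arg\min_{j \in B \cup I :\, q_j < 0} \frac{-x^*_j}{q_j}.$$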

If $k \in I$, a variable in $I$ hits zero without driving a basic variable below zero. A
pivot is unnecessary. Update $x^*$ by (7.1) and proceed with another iteration. Note
that $x^*_k = 0$, so $|I|$ has decreased by at least 1. Now, suppose $k \in B$. Then a basic
variable hits zero before any variable in $I$ does. Let $r$ index the component of $B$
containing $k$. Define

$$\hat{I} = \{\, j \in I : \bar{A}_{r,j} < 0 \,\}.$$

$\hat{I}$ indexes variables in $I$ that can drive the $r$th basic variable to zero. Since $k \in B$
and

$$\theta \le \min_{j \in I :\, q_j < 0} \frac{-x^*_j}{q_j} < +\infty,$$

the ratio test (7.2) implies that $\hat{I} \neq \emptyset$. One can then choose

$$s = \arg\max_{j \in \hat{I}} x^*_j$$

to index the incoming variable. Then $\bar{A}_{r,s} < 0$ and $s \in I$. The resulting pivot
maintains nonsingularity of the basis. Update $x^*$ by (7.1). Since $s \in I$ enters the
basis and $x^*_k = 0$, $|I|$ decreases by at least one. If $|I| = 0$, terminate with an
optimal vertex. Otherwise, start the next iteration.

Since this procedure decreases $|I|$ by at least 1 during each iteration, it will drive
all positive nonbasic variables to zero within $|I|$ iterations. Some flexibility exists
within  the  pivoting  strategies.  In  practice,  however,  the  procedure  was  unnecessary 
for  the  majority  of  the  problems  tested,  and  it  never  required  more  than  three  pivots 
before  termination.  As  a  result,  extensive  experimentation  was  never  pursued. 
Note  that  the  computation  times  of  Figure  2-3  include  any  time  required  by  this 
procedure. 
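
The procedure of this section can be sketched in a few lines of Python. This is a minimal dense-matrix illustration under stated assumptions, not the MINOS implementation: the name `move_to_vertex`, the tolerance `tol`, and the use of `numpy.linalg.solve` in place of a maintained basis factorization are hypothetical choices made for the example. The rule used to pick the incoming variable among the eligible columns (largest current value) is one admissible choice among several. The sketch assumes every positive nonbasic variable has zero reduced cost, so the objective value does not change.

```python
import numpy as np

def move_to_vertex(A, x, basis, tol=1e-9):
    """Drive positive nonbasic variables to zero, keeping A x fixed and x >= 0.

    A     : (m, n) constraint matrix
    x     : feasible point whose positive nonbasic variables all have zero
            reduced cost (the set I of the text)
    basis : list of m column indices with A[:, basis] nonsingular
    Returns the resulting vertex and the final basis (hypothetical interface).
    """
    A = np.asarray(A, dtype=float)
    x = np.array(x, dtype=float)
    basis = list(basis)
    n = A.shape[1]
    while True:
        # I: positive nonbasic variables still to be removed.
        I = [j for j in range(n) if j not in basis and x[j] > tol]
        if not I:
            return x, basis
        B = A[:, basis]
        AbarI = np.linalg.solve(B, A[:, I])   # columns of B^{-1} A_I
        q = np.zeros(n)
        q[I] = -1.0                           # q_I = -e
        q[basis] = AbarI.sum(axis=1)          # B q_B = -A_I q_I = A_I e
        # Ratio test over B and I: largest step keeping x >= 0.
        cand = [j for j in basis + I if q[j] < -tol]
        ratios = [-x[j] / q[j] for j in cand]
        t = int(np.argmin(ratios))
        theta, k = ratios[t], cand[t]
        if k in basis:
            # A basic variable blocks first: pivot it out.
            r = basis.index(k)
            eligible = [j for j, a in zip(I, AbarI[r]) if a < 0.0]
            # Assumption: choose the eligible column with the largest value.
            s = max(eligible, key=lambda j: x[j])
            basis[r] = s
        x += theta * q
        x[np.abs(x) < tol] = 0.0              # clean up round-off
```

For instance, with `A = [[1, -1, 0], [0, 1, 1]]`, starting point `(0.5, 1, 1)` and basis columns `[0, 2]`, the first basic variable blocks the step, column 1 enters the basis, and the loop stops after a single pivot.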


7.2.  Extensions  to  Bounded  Variables 

For expository purposes we have considered linear programs of the form (1.1).
Although one can express any linear program in this form, the transformations
required to do so frequently involve extra variables and constraints, which would result
in computational inefficiencies. In practice one wishes to process linear programs
in their most general form:


$$\begin{array}{ll}
\text{minimize}   & c^T y \\
\text{subject to} & A_E y = b_E \\
                  & A_G y \ge b_G \\
                  & A_L y \le b_L \\
                  & l \le y \le u.
\end{array} \tag{7.3}$$


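Any linear program fits (7.3) without any transformations. Define $s = (s_E, s_G, s_L)$
as slacks on the constraints of (7.3). Rewrite (7.3) as

$$\begin{array}{ll}
\text{minimize}   & c^T y \\
\text{subject to} & A_E y + s_E - b_E = 0 \\
                  & A_G y - (b_G + s_G) = 0 \\
                  & A_L y + s_L - b_L = 0 \\
                  & l \le y \le u, \quad s_E = 0, \quad s_G, s_L \ge 0.
\end{array} \tag{7.4}$$

Let

$$\bar{s}_E = s_E - b_E, \qquad \bar{s}_G = -(b_G + s_G), \qquad \bar{s}_L = s_L - b_L.$$

Then (7.4) is equivalent to

$$\begin{array}{ll}
\text{minimize}   & c^T y \\
\text{subject to} & A_E y + \bar{s}_E = 0 \\
                  & A_G y + \bar{s}_G = 0 \\
                  & A_L y + \bar{s}_L = 0 \\
                  & l \le y \le u, \\
                  & -b_E \le \bar{s}_E \le -b_E, \\
                  & -\infty < \bar{s}_G \le -b_G, \\
                  & -b_L \le \bar{s}_L < +\infty.
\end{array}$$

Letting

$$A = \begin{pmatrix} A_E & I & & \\ A_G & & I & \\ A_L & & & I \end{pmatrix}, \qquad
\begin{aligned}
x &= (y, \bar{s}_E, \bar{s}_G, \bar{s}_L), \\
l &= (l, -b_E, -\infty, -b_L), \\
u &= (u, -b_E, -b_G, +\infty),
\end{aligned}$$

(7.4) becomes

$$\begin{array}{ll}
\text{minimize}   & c^T x \\
\text{subject to} & A x = 0 \\
                  & l \le x \le u.
\end{array} \tag{7.5}$$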

MINOS  processes  the  general  linear  program  (7.3)  into  the  form  (7.5).  Only  the 
bounds  on  the  variables  distinguish  (7.5)  from  the  standard  form  (1.1).  Therefore, 
the remainder of the appendix focuses on generalizing the results of Chapters 2 and
3 to handle bounded variables.
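
The conversion from the general form (7.3) to the bounded form (7.5) can be illustrated with a short Python sketch. It only mirrors the algebra above and is not the MINOS input processing: the function name `to_bounded_form` and its argument order are hypothetical, dense NumPy arrays stand in for sparse storage, and each constraint block is assumed to have at least one row.

```python
import numpy as np

def to_bounded_form(A_E, b_E, A_G, b_G, A_L, b_L, l, u):
    """Build (A, lower, upper) so that A x = 0, lower <= x <= upper.

    x stacks the original variables y and one slack per constraint row,
    ordered as (y, s_E, s_G, s_L). All names here are illustrative.
    """
    b_E, b_G, b_L = (np.asarray(b, dtype=float) for b in (b_E, b_G, b_L))
    A_y = np.vstack([A_E, A_G, A_L])          # constraint rows on y
    m = A_y.shape[0]
    A = np.hstack([A_y, np.eye(m)])           # one identity column per slack
    # Equality rows fix the slack at -b_E; ">=" rows bound it above by -b_G;
    # "<=" rows bound it below by -b_L.
    lower = np.concatenate([l, -b_E, np.full(b_G.size, -np.inf), -b_L])
    upper = np.concatenate([u, -b_E, -b_G, np.full(b_L.size, np.inf)])
    return A, lower, upper
```

Given a point $y$, the corresponding slacks are simply the negated row activities, so feasibility for (7.3) reduces to a componentwise bound check on $x$.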

Bounded  variables  generate  some  minor  modifications  to  the  feasible  direction 
methods  of  Chapter  2.  First  of  all,  consider  a  search  direction  q  and  an  updated 
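solution $\bar{x} = x + \theta q$ as in (2.2). Equation (2.4) ensures that variables in $P$ do not
violate the nonnegativity constraints of the standard form linear program (1.1).
We must now guarantee that such variables do not violate the bound constraints of
(7.5). In other words, for $j \in P$,

$$\begin{aligned}
& l_j \le x_j + \theta q_j \le u_j \\
\iff\; & \theta q_j \le u_j - x_j \quad\text{and}\quad \theta q_j \ge l_j - x_j \\
\iff\; & \theta \le \frac{u_j - x_j}{q_j} \ \text{ for } j : q_j > 0 \quad\text{and}\quad
         \theta \le \frac{l_j - x_j}{q_j} \ \text{ for } j : q_j < 0 \\
\iff\; & \theta \le \min\left\{ \min_{j \in P :\, q_j < 0} \frac{l_j - x_j}{q_j},\;
         \min_{j \in P :\, q_j > 0} \frac{u_j - x_j}{q_j} \right\}.
\end{aligned}$$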


Similar  logic  applies  to  the  basic  variables  in  ratio  test  (2.5),  the  two-column  linear 
program  (2.12),  and  the  pivoting  tactics  outlined  by  (2.15-2.20). 
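
The bound-respecting step length for the variables in $P$ can be computed directly from the last inequality. The following Python sketch is illustrative only; the function name `max_step` and its signature are assumptions, and infinite bounds are represented by `numpy.inf`.

```python
import numpy as np

def max_step(x, q, lower, upper, idx):
    """Largest theta with lower[j] <= x[j] + theta*q[j] <= upper[j] for j in idx."""
    theta = np.inf
    for j in idx:
        if q[j] > 0:
            # Increasing variable: blocked by its upper bound.
            theta = min(theta, (upper[j] - x[j]) / q[j])
        elif q[j] < 0:
            # Decreasing variable: blocked by its lower bound.
            theta = min(theta, (lower[j] - x[j]) / q[j])
    return theta
```

A component with `q[j] == 0` never blocks the step, and an infinite bound contributes an infinite ratio, so the function returns `inf` when no bound is ever reached.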

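Bounded variables also influence the choice of promising variables. (2.13) now
becomes

$$P = \left\{ j \in N : \begin{array}{l}
\bar{c}_j < 0 \text{ and } x_j < u_j, \text{ or} \\
\bar{c}_j > 0 \text{ and } x_j > l_j
\end{array} \right\}.$$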
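
As a small illustration of the bounded promising criterion, the Python sketch below selects the nonbasic indices that can still move in an improving direction. The function name `promising` and the tolerance `tol` are assumptions introduced for the example; the thesis itself does not specify how ties with the bounds are detected numerically.

```python
def promising(nonbasic, cbar, x, lower, upper, tol=1e-9):
    """Nonbasic indices that can improve the objective without leaving their bounds."""
    return [j for j in nonbasic
            # Negative reduced cost: x_j should increase, so it must be below u_j.
            if (cbar[j] < -tol and x[j] < upper[j] - tol)
            # Positive reduced cost: x_j should decrease, so it must be above l_j.
            or (cbar[j] > tol and x[j] > lower[j] + tol)]
```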

The  modified  promising  criterion  (2.30)  and  pivot  rules  (3.1),  (3.2),  (3.3)  and  (3.6) 
all  utilize  reduced  costs  computed  from  a  second  objective  vector  d.  Bounded 
variables affect the definition of $d$. Recall the choice of $d_B$ defined by Lemma 3. For
$i = 1, \ldots, m$, set

$$d_{j_i} = \begin{cases} 1 & \text{if } x_{j_i} = 0, \\ 0 & \text{otherwise.} \end{cases}$$

This  particular  choice  of  d  identifies  potential  incoming  variables  that  would  cause 
a  degenerate  pivot.  In  the  presence  of  bounded  variables,  degeneracy  occurs  when 
a  basic  variable  resides  at  one  of  its  bounds.  Also,  instead  of  increasing  from  zero, 
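a potential incoming variable $x_j$ may increase from its lower bound $l_j$ or decrease
from its upper bound $u_j$. Let us examine these two cases separately. First, suppose
$\bar{c}_j < 0$, so $x_j$ increases from $l_j$. A degenerate pivot occurs if

$$\exists\, i : \quad \bar{A}_{i,j} > 0,\ x_{j_i} = l_{j_i} \quad\text{or}\quad \bar{A}_{i,j} < 0,\ x_{j_i} = u_{j_i}. \tag{7.6}$$

Therefore, for $i = 1, \ldots, m$, set

$$d_{j_i} = \begin{cases}
1 & \text{if } x_{j_i} = l_{j_i}, \\
-1 & \text{if } x_{j_i} = u_{j_i}, \\
0 & \text{otherwise.}
\end{cases} \tag{7.7}$$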

If $\bar{d}_j = d_B^T \bar{A}_{\cdot j} = \sigma^T A_{\cdot j} > 0$, exclude $x_j$ from consideration in order to avoid a
degenerate pivot. If $\bar{c}_j > 0$, a degenerate pivot occurs if

$$\exists\, i : \quad \bar{A}_{i,j} < 0,\ x_{j_i} = l_{j_i} \quad\text{or}\quad \bar{A}_{i,j} > 0,\ x_{j_i} = u_{j_i}. \tag{7.8}$$

(7.8) is the opposite of (7.6), so set $d_B$ as in (7.7), but exclude $x_j$ from consideration
if $\bar{d}_j < 0$. Thus, the extension to bounded variables still only requires a single solve
of the form (2.26) for the vector $\sigma$.
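
The screening test just described can be sketched in Python as follows. This is an illustration under assumptions, not the thesis code: the function name `degeneracy_screen` and the tolerance `tol` are hypothetical, the single solve is taken to be $B^T \sigma = d_B$ (assumed here to be the form of (2.26)), and a dense `numpy.linalg.solve` replaces the basis factorization a production code would reuse.

```python
import numpy as np

def degeneracy_screen(A, basis, x, lower, upper, cbar, candidates, tol=1e-9):
    """Drop candidates whose entry is detected to cause a degenerate pivot.

    d_B follows (7.7): +1 for a basic variable at its lower bound,
    -1 at its upper bound, 0 otherwise.
    """
    xB = x[basis]
    dB = np.zeros(len(basis))
    dB[np.abs(xB - lower[basis]) <= tol] = 1.0
    dB[np.abs(xB - upper[basis]) <= tol] = -1.0
    sigma = np.linalg.solve(A[:, basis].T, dB)   # one solve: B^T sigma = d_B
    keep = []
    for j in candidates:
        dbar = sigma @ A[:, j]                   # equals d_B^T B^{-1} A_j
        if cbar[j] < 0 and dbar > tol:
            continue   # x_j would increase and push a degenerate basic out of bounds
        if cbar[j] > 0 and dbar < -tol:
            continue   # x_j would decrease and do the same
        keep.append(j)
    return keep
```

Because `dbar` sums signed tableau entries over all degenerate rows, a value of the wrong sign proves that at least one degenerate basic variable blocks the step; a value of the right sign does not by itself rule out degeneracy.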

Equation  (7.7)  describes  how  to  determine  dB  in  the  presence  of  degenerate 
basic  variables.  This  suffices  for  pivot  rule  (3.1),  but  the  criteria  (2.30),  (3.2),  (3.3) 
and  (3.6)  consider  positive  basic  variables  as  well.  In  such  cases  the  value  of  a  basic 
variable $x_{j_i}$ no longer determines the value of $d_{j_i}$. Instead, consider the distance $z_i$
of $x_{j_i}$ to its closest bound:

$$z_i = \min\{\, x_{j_i} - l_{j_i},\ u_{j_i} - x_{j_i} \,\}, \qquad i = 1, \ldots, m.$$

The values of $z_i$ determine the values of $d_{j_i}$. So, for example, assuming that $z_i > 0$,
one utilizes the following choice of $d$ to implement pivot rule (3.6):

$$d_{j_i} = \begin{cases}
1/z_i & \text{if } x_{j_i} - l_{j_i} \le u_{j_i} - x_{j_i}, \\
-1/z_i & \text{if } x_{j_i} - l_{j_i} > u_{j_i} - x_{j_i},
\end{cases} \qquad i = 1, \ldots, m. \tag{7.9}$$

In practice, set $z_i$ to $\epsilon > 0$ if $z_i < \epsilon$, where $\epsilon$ represents an appropriately defined
tolerance. Note that if $\bar{c}_j > 0$, $x_j$ decreases if it enters the basis. In that case one
modifies pivot rule (3.6) so that nonnegative values of $\bar{d}_j$, instead of nonpositive
ones, distinguish nonbasic variables with top priority. The same approach applies
for pivot rules (3.2) and (3.3).
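
The choice (7.9), including the tolerance safeguard, takes only a few lines in Python. The sketch below is an illustration: the function name `d_basic_scaled` and the default value of `eps` are assumptions, and the arrays hold the values and bounds of the basic variables only.

```python
import numpy as np

def d_basic_scaled(xB, lB, uB, eps=1e-8):
    """Basic components of d: signed reciprocal distance to the closest bound."""
    to_lower = xB - lB
    to_upper = uB - xB
    # Distance to the closest bound, kept at least eps to avoid division by zero.
    z = np.maximum(np.minimum(to_lower, to_upper), eps)
    # Positive sign when the lower bound is closest, negative otherwise.
    return np.where(to_lower <= to_upper, 1.0 / z, -1.0 / z)
```

Variables sitting exactly on a bound receive the large weight `1/eps` in magnitude, so they dominate the estimate just as the degenerate variables do under (7.7).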

Theorem 4 illustrates an updating procedure for $\sigma$ provided a degenerate pivot
occurred during the previous iteration of the simplex method. Since the bounded
variable form (7.5) redefines the notion of degeneracy, the value of $\rho$ in the update
(3.22) depends on the specifics of the pivot. In addition, MINOS 5.1 permits nonbasic
slack variables with negative lower bounds and positive upper bounds to reside
at any value between the bounds. This tactic attempts to improve the stability
of the feasible iterates. It also aids recovery from singular bases and helps with
restarts. This feature generates two additional types of degenerate pivot. A total
of six possible types of degenerate pivot can occur. Each one is listed below, along
with the corresponding value of $\rho$. Remember that, regardless of the specific case,
a degenerate pivot during the $k$th iteration implies that

$$d_{B_{k+1}} = d_{B_k} + \rho e_r.$$

Let $s$ and $j_r$ index the incoming and outgoing variables during the degenerate pivot
of iteration $k$.


Case 1. Suppose $x_s = l_s$ and $x_{j_r} = l_{j_r}$. Both the incoming and outgoing variables
reside at their lower bounds. Therefore, $d_{B_{k+1}}(r) = d_{B_k}(r)$, so $\rho = 0$.

Case 2. Suppose $x_s = l_s$ and $x_{j_r} = u_{j_r}$. The entering variable resides at its lower
bound, while the outgoing variable resides at its upper bound. In this case (7.7) or
(7.9) implies that $d_{B_{k+1}}(r) = -d_{B_k}(r)$, so $\rho = -2 d_{B_k}(r)$.

Case 3. Suppose $x_s = u_s$ and $x_{j_r} = u_{j_r}$. This is analogous to Case 1, and $\rho = 0$.

Case 4. Suppose $x_s = u_s$ and $x_{j_r} = l_{j_r}$. This is analogous to Case 2, and
$\rho = -2 d_{B_k}(r)$.

Case 5. Suppose $l_s < x_s < u_s$ and $x_{j_r} = l_{j_r}$. This case can arise when the
incoming variable is a slack variable initialized between its bounds. Let
$\gamma = f(\min\{x_s - l_s,\ u_s - x_s\})$, where $f$ is the function that determines the basic
components of $d$ (recall (2.27)). Then $d_{B_{k+1}}(r) = \gamma = d_{B_k}(r) + \rho$, so $\rho = \gamma - d_{B_k}(r)$.

Case 6. Suppose $l_s < x_s < u_s$ and $x_{j_r} = u_{j_r}$. This is analogous to Case 5, so
$\rho = \gamma - d_{B_k}(r)$.
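
The six cases collapse into a short decision routine. The Python sketch below is illustrative only: the function name `rho_for_degenerate_pivot`, its arguments, and the tolerance `tol` are assumptions, and `f` is a caller-supplied stand-in for the function of (2.27), which is not reproduced in this chapter.

```python
def rho_for_degenerate_pivot(x_s, l_s, u_s, leaving_at_upper, d_r, f, tol=1e-9):
    """Change in component r of d_B after a degenerate pivot.

    x_s, l_s, u_s    : value and bounds of the incoming variable
    leaving_at_upper : True if the outgoing variable sits at its upper bound
    d_r              : current value of d_B(r)
    f                : hypothetical placeholder for the function of (2.27)
    """
    if abs(x_s - l_s) <= tol:
        entering_at_upper = False
    elif abs(x_s - u_s) <= tol:
        entering_at_upper = True
    else:
        # Cases 5 and 6: incoming variable strictly between its bounds.
        gamma = f(min(x_s - l_s, u_s - x_s))
        return gamma - d_r
    if entering_at_upper == leaving_at_upper:
        return 0.0            # Cases 1 and 3: same kind of bound, d_B(r) unchanged
    return -2.0 * d_r         # Cases 2 and 4: d_B(r) changes sign
```

With `f = lambda z: 1.0 / z`, an incoming variable at distance 2 from its closest bound and a current value `d_r = 1.0` give a correction of `-0.5`, so the new component equals `0.5`.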

The  parametric  algorithm  also  uses  a  second  objective  vector  d ,  albeit  a  con¬ 
stant  one.  Although  the  generalization  of  Theorem  4  becomes  unnecessary,  bounded 
variables  affect  the  pivoting  procedure.  Recall  that  the  algorithm  sets  d}  =  |] [(2 
for $j \in N_0$ in order to ensure optimality with respect to the parametric objective
function $\bar{c}_{N_0}(\theta) = \bar{c}_{N_0} + \theta d_{N_0} \ge 0$. Remember that the bounded variable form (7.5)
permits nonbasic variables to reside at their upper bounds. As a result, if a nonbasic
variable $x_j$ has positive reduced cost and equals its upper bound, use $-d_j$ instead
of $d_j$ in the denominator of the ratio test (3.10).

This  concludes  the  appendix.  Since  the  author  did  not  wish  to  burden  the 
reader  with  every  equation  in  the  bounded  variable  format,  many  equations  influ¬ 
enced  by  the  change  in  form  have  been  omitted.  However,  we  have  considered  all 
of  the  different  ways  bounded  variables  affect  the  mathematics  of  the  thesis.  The 
reader  can  use  the  results  of  this  chapter  to  derive  the  remaining  relations. 


ABSTRACT


In  recent  years  the  interest  in  linear  programming  algorithms  has  increased 
greatly  due  to  the  discovery  of  new  interior-point  methods.  New  results  have  also 
prompted  researchers  to  reconsider  some  previously  discarded  ideas  in  light  of  their 
additional  knowledge.  This  thesis  begins  with  a  study  of  some  variants  of  the 
reduced-gradient  method  applied  to  linear  programs.  Preliminary  computational 
tests  revealed  how  sparsity  and  degeneracy,  two  characteristics  present  in  most 
practical  problems,  can  severely  inhibit  such  variants.  The  development  of  dynamic 
pricing  criteria  to  exclude  certain  columns  from  the  search  direction  provides  a  com¬ 
putationally  efficient  way  to  alleviate  these  difficulties.  Application  to  the  simplex 
method  yields  a  pivot  rule  designed  to  avoid  degenerate  pivots.  Generalization  of 
the  rule  yields  a  cheap  method  to  estimate  the  step  lengths  associated  with  potential 
incoming  nonbasic  variables.  The  result  is  a  set  of  pivot  rules  that  appear  partic¬ 
ularly  useful  on  highly  degenerate  and  poorly  scaled  linear  programs.  Extensive 
computational  tests  are  presented. 